[PATCH 0/3] mm: cma: /proc/cmainfo

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 0/3] mm: cma: /proc/cmainfo
@ 2014-12-26 14:39 Stefan I. Strogin
  2014-12-26 14:39 ` [PATCH 1/3] stacktrace: add seq_print_stack_trace() Stefan I. Strogin
                   ` (3 more replies)
  0 siblings, 4 replies; 39+ messages in thread
From: Stefan I. Strogin @ 2014-12-26 14:39 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: Stefan I. Strogin, Joonsoo Kim, Andrew Morton, Marek Szyprowski,
	Michal Nazarewicz, aneesh.kumar, Laurent Pinchart, Dmitry Safonov,
	Pintu Kumar, Weijie Yang, Laura Abbott, SeongJae Park, Hui Zhu,
	Minchan Kim, Dyasly Sergey, Vyacheslav Tyrtov

Hello all,

Here is a patch set that adds /proc/cmainfo.

When compiled with CONFIG_CMA_DEBUG /proc/cmainfo will contain information
about about total, used, maximum free contiguous chunk and all currently
allocated contiguous buffers in CMA regions. The information about allocated
CMA buffers includes pid, comm, allocation latency and stacktrace at the
moment of allocation.

Example:

# cat /proc/cmainfo 
CMARegion stat:    65536 kB total,      248 kB used,    65216 kB max contiguous chunk

0x32400000 - 0x32401000 (4 kB), allocated by pid 63 (systemd-udevd), latency 74 us
 [<c1006e96>] dma_generic_alloc_coherent+0x86/0x160
 [<c13093af>] rpm_idle+0x1f/0x1f0
 [<c1006e10>] dma_generic_alloc_coherent+0x0/0x160
 [<f80a533e>] ohci_init+0x1fe/0x430 [ohci_hcd]
 [<c1006e10>] dma_generic_alloc_coherent+0x0/0x160
 [<f801404f>] ohci_pci_reset+0x4f/0x60 [ohci_pci]
 [<f80f165c>] usb_add_hcd+0x1fc/0x900 [usbcore]
 [<c1256158>] pcibios_set_master+0x38/0x90
 [<f8101ea6>] usb_hcd_pci_probe+0x176/0x4f0 [usbcore]
 [<c125852f>] pci_device_probe+0x6f/0xd0
 [<c1199495>] sysfs_create_link+0x25/0x50
 [<c1300522>] driver_probe_device+0x92/0x3b0
 [<c14564fb>] __mutex_lock_slowpath+0x5b/0x90
 [<c1300880>] __driver_attach+0x0/0x80
 [<c13008f9>] __driver_attach+0x79/0x80
 [<c1300880>] __driver_attach+0x0/0x80

0x32401000 - 0x32402000 (4 kB), allocated by pid 58 (systemd-udevd), latency 17 us
 [<c130e370>] dmam_coherent_release+0x0/0x90
 [<c112d76c>] __kmalloc_track_caller+0x31c/0x380
 [<c1006e96>] dma_generic_alloc_coherent+0x86/0x160
 [<c1006e10>] dma_generic_alloc_coherent+0x0/0x160
 [<c130e226>] dmam_alloc_coherent+0xb6/0x100
 [<f8125153>] ata_bmdma_port_start+0x43/0x60 [libata]
 [<f8113068>] ata_host_start.part.29+0xb8/0x190 [libata]
 [<c13624a0>] pci_read+0x30/0x40
 [<f8124eb9>] ata_pci_sff_activate_host+0x29/0x220 [libata]
 [<f8127050>] ata_bmdma_interrupt+0x0/0x1f0 [libata]
 [<c1256158>] pcibios_set_master+0x38/0x90
 [<f80ad9be>] piix_init_one+0x44e/0x630 [ata_piix]
 [<c1455ef0>] mutex_lock+0x10/0x20
 [<c1197093>] kernfs_activate+0x63/0xd0
 [<c11971c3>] kernfs_add_one+0xc3/0x130
 [<c125852f>] pci_device_probe+0x6f/0xd0
<...>

Dmitry Safonov (1):
  cma: add functions to get region pages counters

Stefan I. Strogin (2):
  stacktrace: add seq_print_stack_trace()
  mm: cma: introduce /proc/cmainfo

 include/linux/cma.h        |   2 +
 include/linux/stacktrace.h |   4 +
 kernel/stacktrace.c        |  17 ++++
 mm/cma.c                   | 236 +++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 259 insertions(+)

-- 
2.1.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH 1/3] stacktrace: add seq_print_stack_trace()
  2014-12-26 14:39 [PATCH 0/3] mm: cma: /proc/cmainfo Stefan I. Strogin
@ 2014-12-26 14:39 ` Stefan I. Strogin
  2014-12-27  7:04   ` SeongJae Park
  2014-12-26 14:39 ` [PATCH 2/3] mm: cma: introduce /proc/cmainfo Stefan I. Strogin
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 39+ messages in thread
From: Stefan I. Strogin @ 2014-12-26 14:39 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: Stefan I. Strogin, Joonsoo Kim, Andrew Morton, Marek Szyprowski,
	Michal Nazarewicz, aneesh.kumar, Laurent Pinchart, Dmitry Safonov,
	Pintu Kumar, Weijie Yang, Laura Abbott, SeongJae Park, Hui Zhu,
	Minchan Kim, Dyasly Sergey, Vyacheslav Tyrtov

Add a function seq_print_stack_trace() which prints stacktraces to seq_files.

Signed-off-by: Stefan I. Strogin <s.strogin@partner.samsung.com>
---
 include/linux/stacktrace.h |  4 ++++
 kernel/stacktrace.c        | 17 +++++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/include/linux/stacktrace.h b/include/linux/stacktrace.h
index 669045a..6d62484 100644
--- a/include/linux/stacktrace.h
+++ b/include/linux/stacktrace.h
@@ -2,6 +2,7 @@
 #define __LINUX_STACKTRACE_H
 
 #include <linux/types.h>
+#include <linux/seq_file.h>
 
 struct task_struct;
 struct pt_regs;
@@ -24,6 +25,8 @@ extern void save_stack_trace_tsk(struct task_struct *tsk,
 extern void print_stack_trace(struct stack_trace *trace, int spaces);
 extern int snprint_stack_trace(char *buf, size_t size,
 			struct stack_trace *trace, int spaces);
+extern void seq_print_stack_trace(struct seq_file *m,
+			struct stack_trace *trace, int spaces);
 
 #ifdef CONFIG_USER_STACKTRACE_SUPPORT
 extern void save_stack_trace_user(struct stack_trace *trace);
@@ -37,6 +40,7 @@ extern void save_stack_trace_user(struct stack_trace *trace);
 # define save_stack_trace_user(trace)			do { } while (0)
 # define print_stack_trace(trace, spaces)		do { } while (0)
 # define snprint_stack_trace(buf, size, trace, spaces)	do { } while (0)
+# define seq_print_stack_trace(m, trace, spaces)	do { } while (0)
 #endif
 
 #endif
diff --git a/kernel/stacktrace.c b/kernel/stacktrace.c
index b6e4c16..66ef6f4 100644
--- a/kernel/stacktrace.c
+++ b/kernel/stacktrace.c
@@ -57,6 +57,23 @@ int snprint_stack_trace(char *buf, size_t size,
 }
 EXPORT_SYMBOL_GPL(snprint_stack_trace);
 
+void seq_print_stack_trace(struct seq_file *m, struct stack_trace *trace,
+			int spaces)
+{
+	int i;
+
+	if (WARN_ON(!trace->entries))
+		return;
+
+	for (i = 0; i < trace->nr_entries; i++) {
+		unsigned long ip = trace->entries[i];
+
+		seq_printf(m, "%*c[<%p>] %pS\n", 1 + spaces, ' ',
+				(void *) ip, (void *) ip);
+	}
+}
+EXPORT_SYMBOL_GPL(seq_print_stack_trace);
+
 /*
  * Architectures that do not implement save_stack_trace_tsk or
  * save_stack_trace_regs get this weak alias and a once-per-bootup warning
-- 
2.1.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/3] stacktrace: add seq_print_stack_trace()
  2014-12-26 14:39 ` [PATCH 1/3] stacktrace: add seq_print_stack_trace() Stefan I. Strogin
@ 2014-12-27  7:04   ` SeongJae Park
  0 siblings, 0 replies; 39+ messages in thread
From: SeongJae Park @ 2014-12-27  7:04 UTC (permalink / raw)
  To: Stefan I. Strogin
  Cc: linux-mm, linux-kernel, Joonsoo Kim, Andrew Morton,
	Marek Szyprowski, Michal Nazarewicz, aneesh.kumar,
	Laurent Pinchart, Dmitry Safonov, Pintu Kumar, Weijie Yang,
	Laura Abbott, SeongJae Park, Hui Zhu, Minchan Kim, Dyasly Sergey,
	Vyacheslav Tyrtov



On Fri, 26 Dec 2014, Stefan I. Strogin wrote:

> Add a function seq_print_stack_trace() which prints stacktraces to seq_files.
>
> Signed-off-by: Stefan I. Strogin <s.strogin@partner.samsung.com>

Reviewed-by: SeongJae Park <sj38.park@gmail.com>

> ---
> include/linux/stacktrace.h |  4 ++++
> kernel/stacktrace.c        | 17 +++++++++++++++++
> 2 files changed, 21 insertions(+)
>
> diff --git a/include/linux/stacktrace.h b/include/linux/stacktrace.h
> index 669045a..6d62484 100644
> --- a/include/linux/stacktrace.h
> +++ b/include/linux/stacktrace.h
> @@ -2,6 +2,7 @@
> #define __LINUX_STACKTRACE_H
>
> #include <linux/types.h>
> +#include <linux/seq_file.h>
>
> struct task_struct;
> struct pt_regs;
> @@ -24,6 +25,8 @@ extern void save_stack_trace_tsk(struct task_struct *tsk,
> extern void print_stack_trace(struct stack_trace *trace, int spaces);
> extern int snprint_stack_trace(char *buf, size_t size,
> 			struct stack_trace *trace, int spaces);
> +extern void seq_print_stack_trace(struct seq_file *m,
> +			struct stack_trace *trace, int spaces);
>
> #ifdef CONFIG_USER_STACKTRACE_SUPPORT
> extern void save_stack_trace_user(struct stack_trace *trace);
> @@ -37,6 +40,7 @@ extern void save_stack_trace_user(struct stack_trace *trace);
> # define save_stack_trace_user(trace)			do { } while (0)
> # define print_stack_trace(trace, spaces)		do { } while (0)
> # define snprint_stack_trace(buf, size, trace, spaces)	do { } while (0)
> +# define seq_print_stack_trace(m, trace, spaces)	do { } while (0)
> #endif
>
> #endif
> diff --git a/kernel/stacktrace.c b/kernel/stacktrace.c
> index b6e4c16..66ef6f4 100644
> --- a/kernel/stacktrace.c
> +++ b/kernel/stacktrace.c
> @@ -57,6 +57,23 @@ int snprint_stack_trace(char *buf, size_t size,
> }
> EXPORT_SYMBOL_GPL(snprint_stack_trace);
>
> +void seq_print_stack_trace(struct seq_file *m, struct stack_trace *trace,
> +			int spaces)
> +{
> +	int i;
> +
> +	if (WARN_ON(!trace->entries))
> +		return;
> +
> +	for (i = 0; i < trace->nr_entries; i++) {
> +		unsigned long ip = trace->entries[i];
> +
> +		seq_printf(m, "%*c[<%p>] %pS\n", 1 + spaces, ' ',
> +				(void *) ip, (void *) ip);
> +	}
> +}
> +EXPORT_SYMBOL_GPL(seq_print_stack_trace);
> +
> /*
>  * Architectures that do not implement save_stack_trace_tsk or
>  * save_stack_trace_regs get this weak alias and a once-per-bootup warning
> -- 
> 2.1.0
>
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH 2/3] mm: cma: introduce /proc/cmainfo
  2014-12-26 14:39 [PATCH 0/3] mm: cma: /proc/cmainfo Stefan I. Strogin
  2014-12-26 14:39 ` [PATCH 1/3] stacktrace: add seq_print_stack_trace() Stefan I. Strogin
@ 2014-12-26 14:39 ` Stefan I. Strogin
  2014-12-26 16:02   ` Michal Nazarewicz
                     ` (2 more replies)
  2014-12-26 14:39 ` [PATCH 3/3] cma: add functions to get region pages counters Stefan I. Strogin
  2014-12-29  2:36 ` [PATCH 0/3] mm: cma: /proc/cmainfo Minchan Kim
  3 siblings, 3 replies; 39+ messages in thread
From: Stefan I. Strogin @ 2014-12-26 14:39 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: Stefan I. Strogin, Joonsoo Kim, Andrew Morton, Marek Szyprowski,
	Michal Nazarewicz, aneesh.kumar, Laurent Pinchart, Dmitry Safonov,
	Pintu Kumar, Weijie Yang, Laura Abbott, SeongJae Park, Hui Zhu,
	Minchan Kim, Dyasly Sergey, Vyacheslav Tyrtov

/proc/cmainfo contains a list of currently allocated CMA buffers for every
CMA area when CONFIG_CMA_DEBUG is enabled.

Format is:

<base_phys_addr> - <end_phys_addr> (<size> kB), allocated by <PID>\
		(<command name>), latency <allocation latency> us
 <stack backtrace when the buffer had been allocated>

Signed-off-by: Stefan I. Strogin <s.strogin@partner.samsung.com>
---
 mm/cma.c | 202 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 202 insertions(+)

diff --git a/mm/cma.c b/mm/cma.c
index a85ae28..ffaea26 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -34,6 +34,10 @@
 #include <linux/cma.h>
 #include <linux/highmem.h>
 #include <linux/io.h>
+#include <linux/list.h>
+#include <linux/proc_fs.h>
+#include <linux/uaccess.h>
+#include <linux/time.h>
 
 struct cma {
 	unsigned long	base_pfn;
@@ -41,8 +45,25 @@ struct cma {
 	unsigned long	*bitmap;
 	unsigned int order_per_bit; /* Order of pages represented by one bit */
 	struct mutex	lock;
+#ifdef CONFIG_CMA_DEBUG
+	struct list_head buffers_list;
+	struct mutex	list_lock;
+#endif
 };
 
+#ifdef CONFIG_CMA_DEBUG
+struct cma_buffer {
+	unsigned long pfn;
+	unsigned long count;
+	pid_t pid;
+	char comm[TASK_COMM_LEN];
+	unsigned int latency;
+	unsigned long trace_entries[16];
+	unsigned int nr_entries;
+	struct list_head list;
+};
+#endif
+
 static struct cma cma_areas[MAX_CMA_AREAS];
 static unsigned cma_area_count;
 static DEFINE_MUTEX(cma_mutex);
@@ -132,6 +153,10 @@ static int __init cma_activate_area(struct cma *cma)
 	} while (--i);
 
 	mutex_init(&cma->lock);
+#ifdef CONFIG_CMA_DEBUG
+	INIT_LIST_HEAD(&cma->buffers_list);
+	mutex_init(&cma->list_lock);
+#endif
 	return 0;
 
 err:
@@ -347,6 +372,86 @@ err:
 	return ret;
 }
 
+#ifdef CONFIG_CMA_DEBUG
+/**
+ * cma_buffer_list_add() - add a new entry to a list of allocated buffers
+ * @cma:     Contiguous memory region for which the allocation is performed.
+ * @pfn:     Base PFN of the allocated buffer.
+ * @count:   Number of allocated pages.
+ * @latency: Nanoseconds spent to allocate the buffer.
+ *
+ * This function adds a new entry to the list of allocated contiguous memory
+ * buffers in a CMA area. It uses the CMA area specificated by the device
+ * if available or the default global one otherwise.
+ */
+static int cma_buffer_list_add(struct cma *cma, unsigned long pfn,
+			       int count, s64 latency)
+{
+	struct cma_buffer *cmabuf;
+	struct stack_trace trace;
+
+	cmabuf = kmalloc(sizeof(struct cma_buffer), GFP_KERNEL);
+	if (!cmabuf)
+		return -ENOMEM;
+
+	trace.nr_entries = 0;
+	trace.max_entries = ARRAY_SIZE(cmabuf->trace_entries);
+	trace.entries = &cmabuf->trace_entries[0];
+	trace.skip = 2;
+	save_stack_trace(&trace);
+
+	cmabuf->pfn = pfn;
+	cmabuf->count = count;
+	cmabuf->pid = task_pid_nr(current);
+	cmabuf->nr_entries = trace.nr_entries;
+	get_task_comm(cmabuf->comm, current);
+	cmabuf->latency = (unsigned int) div_s64(latency, NSEC_PER_USEC);
+
+	mutex_lock(&cma->list_lock);
+	list_add_tail(&cmabuf->list, &cma->buffers_list);
+	mutex_unlock(&cma->list_lock);
+
+	return 0;
+}
+
+/**
+ * cma_buffer_list_del() - delete an entry from a list of allocated buffers
+ * @cma:   Contiguous memory region for which the allocation was performed.
+ * @pfn:   Base PFN of the released buffer.
+ *
+ * This function deletes a list entry added by cma_buffer_list_add().
+ */
+static void cma_buffer_list_del(struct cma *cma, unsigned long pfn)
+{
+	struct cma_buffer *cmabuf;
+
+	mutex_lock(&cma->list_lock);
+
+	list_for_each_entry(cmabuf, &cma->buffers_list, list)
+		if (cmabuf->pfn == pfn) {
+			list_del(&cmabuf->list);
+			kfree(cmabuf);
+			goto out;
+		}
+
+	pr_err("%s(pfn %lu): couldn't find buffers list entry\n",
+	       __func__, pfn);
+
+out:
+	mutex_unlock(&cma->list_lock);
+}
+#else
+static int cma_buffer_list_add(struct cma *cma, unsigned long pfn,
+			       int count, s64 latency)
+{
+	return 0;
+}
+
+static void cma_buffer_list_del(struct cma *cma, unsigned long pfn)
+{
+}
+#endif /* CONFIG_CMA_DEBUG */
+
 /**
  * cma_alloc() - allocate pages from contiguous area
  * @cma:   Contiguous memory region for which the allocation is performed.
@@ -361,11 +466,15 @@ struct page *cma_alloc(struct cma *cma, int count, unsigned int align)
 	unsigned long mask, offset, pfn, start = 0;
 	unsigned long bitmap_maxno, bitmap_no, bitmap_count;
 	struct page *page = NULL;
+	struct timespec ts1, ts2;
+	s64 latency;
 	int ret;
 
 	if (!cma || !cma->count)
 		return NULL;
 
+	getnstimeofday(&ts1);
+
 	pr_debug("%s(cma %p, count %d, align %d)\n", __func__, (void *)cma,
 		 count, align);
 
@@ -413,6 +522,19 @@ struct page *cma_alloc(struct cma *cma, int count, unsigned int align)
 		start = bitmap_no + mask + 1;
 	}
 
+	getnstimeofday(&ts2);
+	latency = timespec_to_ns(&ts2) - timespec_to_ns(&ts1);
+
+	if (page) {
+		ret = cma_buffer_list_add(cma, pfn, count, latency);
+		if (ret) {
+			pr_warn("%s(): cma_buffer_list_add() returned %d\n",
+				__func__, ret);
+			cma_release(cma, page, count);
+			page = NULL;
+		}
+	}
+
 	pr_debug("%s(): returned %p\n", __func__, page);
 	return page;
 }
@@ -445,6 +567,86 @@ bool cma_release(struct cma *cma, struct page *pages, int count)
 
 	free_contig_range(pfn, count);
 	cma_clear_bitmap(cma, pfn, count);
+	cma_buffer_list_del(cma, pfn);
 
 	return true;
 }
+
+#ifdef CONFIG_CMA_DEBUG
+static void *s_start(struct seq_file *m, loff_t *pos)
+{
+	struct cma *cma = 0;
+
+	if (*pos == 0 && cma_area_count > 0)
+		cma = &cma_areas[0];
+	else
+		*pos = 0;
+
+	return cma;
+}
+
+static int s_show(struct seq_file *m, void *p)
+{
+	struct cma *cma = p;
+	struct cma_buffer *cmabuf;
+	struct stack_trace trace;
+
+	mutex_lock(&cma->list_lock);
+
+	list_for_each_entry(cmabuf, &cma->buffers_list, list) {
+		seq_printf(m, "0x%llx - 0x%llx (%lu kB), allocated by pid %u (%s), latency %u us\n",
+			   (unsigned long long)PFN_PHYS(cmabuf->pfn),
+			   (unsigned long long)PFN_PHYS(cmabuf->pfn +
+							cmabuf->count),
+			   (cmabuf->count * PAGE_SIZE) >> 10, cmabuf->pid,
+			   cmabuf->comm, cmabuf->latency);
+
+		trace.nr_entries = cmabuf->nr_entries;
+		trace.entries = &cmabuf->trace_entries[0];
+
+		seq_print_stack_trace(m, &trace, 0);
+		seq_putc(m, '\n');
+	}
+
+	mutex_unlock(&cma->list_lock);
+	return 0;
+}
+
+static void *s_next(struct seq_file *m, void *p, loff_t *pos)
+{
+	struct cma *cma = (struct cma *)p + 1;
+
+	return (cma < &cma_areas[cma_area_count]) ? cma : 0;
+}
+
+static void s_stop(struct seq_file *m, void *p)
+{
+}
+
+static const struct seq_operations cmainfo_op = {
+	.start = s_start,
+	.show = s_show,
+	.next = s_next,
+	.stop = s_stop,
+};
+
+static int cmainfo_open(struct inode *inode, struct file *file)
+{
+	return seq_open(file, &cmainfo_op);
+}
+
+static const struct file_operations proc_cmainfo_operations = {
+	.open = cmainfo_open,
+	.read = seq_read,
+	.llseek = seq_lseek,
+	.release = seq_release_private,
+};
+
+static int __init proc_cmainfo_init(void)
+{
+	proc_create("cmainfo", S_IRUSR, NULL, &proc_cmainfo_operations);
+	return 0;
+}
+
+module_init(proc_cmainfo_init);
+#endif /* CONFIG_CMA_DEBUG */
-- 
2.1.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] mm: cma: introduce /proc/cmainfo
  2014-12-26 14:39 ` [PATCH 2/3] mm: cma: introduce /proc/cmainfo Stefan I. Strogin
@ 2014-12-26 16:02   ` Michal Nazarewicz
  2014-12-29 14:09     ` Stefan Strogin
  2014-12-29 21:11   ` Laura Abbott
  2014-12-30  4:38   ` Joonsoo Kim
  2 siblings, 1 reply; 39+ messages in thread
From: Michal Nazarewicz @ 2014-12-26 16:02 UTC (permalink / raw)
  To: Stefan I. Strogin, linux-mm, linux-kernel
  Cc: Joonsoo Kim, Andrew Morton, Marek Szyprowski, aneesh.kumar,
	Laurent Pinchart, Dmitry Safonov, Pintu Kumar, Weijie Yang,
	Laura Abbott, SeongJae Park, Hui Zhu, Minchan Kim, Dyasly Sergey,
	Vyacheslav Tyrtov

On Fri, Dec 26 2014, "Stefan I. Strogin" <s.strogin@partner.samsung.com> wrote:
> /proc/cmainfo contains a list of currently allocated CMA buffers for every
> CMA area when CONFIG_CMA_DEBUG is enabled.
>
> Format is:
>
> <base_phys_addr> - <end_phys_addr> (<size> kB), allocated by <PID>\
> 		(<command name>), latency <allocation latency> us
>  <stack backtrace when the buffer had been allocated>
>
> Signed-off-by: Stefan I. Strogin <s.strogin@partner.samsung.com>
> ---
>  mm/cma.c | 202 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 202 insertions(+)
>
> diff --git a/mm/cma.c b/mm/cma.c
> index a85ae28..ffaea26 100644
> --- a/mm/cma.c
> +++ b/mm/cma.c
> @@ -34,6 +34,10 @@
>  #include <linux/cma.h>
>  #include <linux/highmem.h>
>  #include <linux/io.h>
> +#include <linux/list.h>
> +#include <linux/proc_fs.h>
> +#include <linux/uaccess.h>
> +#include <linux/time.h>
>  
>  struct cma {
>  	unsigned long	base_pfn;
> @@ -41,8 +45,25 @@ struct cma {
>  	unsigned long	*bitmap;
>  	unsigned int order_per_bit; /* Order of pages represented by one bit */
>  	struct mutex	lock;
> +#ifdef CONFIG_CMA_DEBUG
> +	struct list_head buffers_list;
> +	struct mutex	list_lock;
> +#endif
>  };
>  
> +#ifdef CONFIG_CMA_DEBUG
> +struct cma_buffer {
> +	unsigned long pfn;
> +	unsigned long count;
> +	pid_t pid;
> +	char comm[TASK_COMM_LEN];
> +	unsigned int latency;
> +	unsigned long trace_entries[16];
> +	unsigned int nr_entries;
> +	struct list_head list;
> +};
> +#endif
> +
>  static struct cma cma_areas[MAX_CMA_AREAS];
>  static unsigned cma_area_count;
>  static DEFINE_MUTEX(cma_mutex);
> @@ -132,6 +153,10 @@ static int __init cma_activate_area(struct cma *cma)
>  	} while (--i);
>  
>  	mutex_init(&cma->lock);
> +#ifdef CONFIG_CMA_DEBUG
> +	INIT_LIST_HEAD(&cma->buffers_list);
> +	mutex_init(&cma->list_lock);
> +#endif
>  	return 0;
>  
>  err:
> @@ -347,6 +372,86 @@ err:
>  	return ret;
>  }
>  
> +#ifdef CONFIG_CMA_DEBUG
> +/**
> + * cma_buffer_list_add() - add a new entry to a list of allocated buffers
> + * @cma:     Contiguous memory region for which the allocation is performed.
> + * @pfn:     Base PFN of the allocated buffer.
> + * @count:   Number of allocated pages.
> + * @latency: Nanoseconds spent to allocate the buffer.
> + *
> + * This function adds a new entry to the list of allocated contiguous memory
> + * buffers in a CMA area. It uses the CMA area specificated by the device
> + * if available or the default global one otherwise.
> + */
> +static int cma_buffer_list_add(struct cma *cma, unsigned long pfn,
> +			       int count, s64 latency)
> +{
> +	struct cma_buffer *cmabuf;
> +	struct stack_trace trace;
> +
> +	cmabuf = kmalloc(sizeof(struct cma_buffer), GFP_KERNEL);

	cmabuf = kmalloc(sizeof *cmabuf, GFP_KERNEL);

> +	if (!cmabuf)
> +		return -ENOMEM;
> +
> +	trace.nr_entries = 0;
> +	trace.max_entries = ARRAY_SIZE(cmabuf->trace_entries);
> +	trace.entries = &cmabuf->trace_entries[0];
> +	trace.skip = 2;
> +	save_stack_trace(&trace);
> +
> +	cmabuf->pfn = pfn;
> +	cmabuf->count = count;
> +	cmabuf->pid = task_pid_nr(current);
> +	cmabuf->nr_entries = trace.nr_entries;
> +	get_task_comm(cmabuf->comm, current);
> +	cmabuf->latency = (unsigned int) div_s64(latency, NSEC_PER_USEC);
> +
> +	mutex_lock(&cma->list_lock);
> +	list_add_tail(&cmabuf->list, &cma->buffers_list);
> +	mutex_unlock(&cma->list_lock);
> +
> +	return 0;
> +}
> +
> +/**
> + * cma_buffer_list_del() - delete an entry from a list of allocated buffers
> + * @cma:   Contiguous memory region for which the allocation was performed.
> + * @pfn:   Base PFN of the released buffer.
> + *
> + * This function deletes a list entry added by cma_buffer_list_add().
> + */
> +static void cma_buffer_list_del(struct cma *cma, unsigned long pfn)
> +{
> +	struct cma_buffer *cmabuf;
> +
> +	mutex_lock(&cma->list_lock);
> +
> +	list_for_each_entry(cmabuf, &cma->buffers_list, list)
> +		if (cmabuf->pfn == pfn) {
> +			list_del(&cmabuf->list);
> +			kfree(cmabuf);
> +			goto out;
> +		}

You do not have guarantee that CMA deallocations will match allocations
exactly.  User may allocate CMA region and then free it chunks.  I'm not
saying that the debug code must handle than case but at least I would
like to see a comment describing this shortcoming.

> +
> +	pr_err("%s(pfn %lu): couldn't find buffers list entry\n",
> +	       __func__, pfn);
> +
> +out:
> +	mutex_unlock(&cma->list_lock);
> +}
> +#else
> +static int cma_buffer_list_add(struct cma *cma, unsigned long pfn,
> +			       int count, s64 latency)
> +{
> +	return 0;
> +}
> +
> +static void cma_buffer_list_del(struct cma *cma, unsigned long pfn)
> +{
> +}
> +#endif /* CONFIG_CMA_DEBUG */
> +
>  /**
>   * cma_alloc() - allocate pages from contiguous area
>   * @cma:   Contiguous memory region for which the allocation is performed.
> @@ -361,11 +466,15 @@ struct page *cma_alloc(struct cma *cma, int count, unsigned int align)
>  	unsigned long mask, offset, pfn, start = 0;
>  	unsigned long bitmap_maxno, bitmap_no, bitmap_count;
>  	struct page *page = NULL;
> +	struct timespec ts1, ts2;
> +	s64 latency;
>  	int ret;
>  
>  	if (!cma || !cma->count)
>  		return NULL;
>  
> +	getnstimeofday(&ts1);
> +

If CMA_DEBUG is disabled, you waste time on measuring latency.  Either
use #ifdef or IS_ENABLED, e.g.:

	if (IS_ENABLED(CMA_DEBUG))
		getnstimeofday(&ts1);

>  	pr_debug("%s(cma %p, count %d, align %d)\n", __func__, (void *)cma,
>  		 count, align);
>  
> @@ -413,6 +522,19 @@ struct page *cma_alloc(struct cma *cma, int count, unsigned int align)
>  		start = bitmap_no + mask + 1;
>  	}
>  
> +	getnstimeofday(&ts2);
> +	latency = timespec_to_ns(&ts2) - timespec_to_ns(&ts1);
> +
> +	if (page) {

	if (IS_ENABLED(CMA_DEBUG) && page) {
		getnstimeofday(&ts2);
		latency = timespec_to_ns(&ts2) - timespec_to_ns(&ts1);

> +		ret = cma_buffer_list_add(cma, pfn, count, latency);

You could also change cma_buffer_list_add to take ts1 as an argument
instead of latency and then latency calculating would be hidden inside
of that function.  Initialising ts1 should still be guarded with
IS_ENABLED of course.

> +		if (ret) {
> +			pr_warn("%s(): cma_buffer_list_add() returned %d\n",
> +				__func__, ret);
> +			cma_release(cma, page, count);
> +			page = NULL;

Harsh, but ok, if you want.

> +		}
> +	}
> +
>  	pr_debug("%s(): returned %p\n", __func__, page);
>  	return page;
>  }
> @@ -445,6 +567,86 @@ bool cma_release(struct cma *cma, struct page *pages, int count)
>  
>  	free_contig_range(pfn, count);
>  	cma_clear_bitmap(cma, pfn, count);
> +	cma_buffer_list_del(cma, pfn);
>  
>  	return true;
>  }
> +
> +#ifdef CONFIG_CMA_DEBUG
> +static void *s_start(struct seq_file *m, loff_t *pos)
> +{
> +	struct cma *cma = 0;
> +
> +	if (*pos == 0 && cma_area_count > 0)
> +		cma = &cma_areas[0];
> +	else
> +		*pos = 0;
> +
> +	return cma;
> +}
> +
> +static int s_show(struct seq_file *m, void *p)
> +{
> +	struct cma *cma = p;
> +	struct cma_buffer *cmabuf;
> +	struct stack_trace trace;
> +
> +	mutex_lock(&cma->list_lock);
> +
> +	list_for_each_entry(cmabuf, &cma->buffers_list, list) {
> +		seq_printf(m, "0x%llx - 0x%llx (%lu kB), allocated by pid %u (%s), latency %u us\n",
> +			   (unsigned long long)PFN_PHYS(cmabuf->pfn),
> +			   (unsigned long long)PFN_PHYS(cmabuf->pfn +
> +							cmabuf->count),
> +			   (cmabuf->count * PAGE_SIZE) >> 10, cmabuf->pid,
> +			   cmabuf->comm, cmabuf->latency);
> +
> +		trace.nr_entries = cmabuf->nr_entries;
> +		trace.entries = &cmabuf->trace_entries[0];
> +
> +		seq_print_stack_trace(m, &trace, 0);
> +		seq_putc(m, '\n');
> +	}
> +
> +	mutex_unlock(&cma->list_lock);
> +	return 0;
> +}
> +
> +static void *s_next(struct seq_file *m, void *p, loff_t *pos)
> +{
> +	struct cma *cma = (struct cma *)p + 1;
> +
> +	return (cma < &cma_areas[cma_area_count]) ? cma : 0;
> +}
> +
> +static void s_stop(struct seq_file *m, void *p)
> +{
> +}
> +
> +static const struct seq_operations cmainfo_op = {
> +	.start = s_start,
> +	.show = s_show,
> +	.next = s_next,
> +	.stop = s_stop,
> +};
> +
> +static int cmainfo_open(struct inode *inode, struct file *file)
> +{
> +	return seq_open(file, &cmainfo_op);
> +}
> +
> +static const struct file_operations proc_cmainfo_operations = {
> +	.open = cmainfo_open,
> +	.read = seq_read,
> +	.llseek = seq_lseek,
> +	.release = seq_release_private,
> +};
> +
> +static int __init proc_cmainfo_init(void)
> +{
> +	proc_create("cmainfo", S_IRUSR, NULL, &proc_cmainfo_operations);
> +	return 0;
> +}
> +
> +module_init(proc_cmainfo_init);
> +#endif /* CONFIG_CMA_DEBUG */
> -- 
> 2.1.0
>

-- 
Best regards,                                         _     _
.o. | Liege of Serenely Enlightened Majesty of      o' \,=./ `o
..o | Computer Science,  Michał “mina86” Nazarewicz    (o o)
ooo +--<mpn@google.com>--<xmpp:mina86@jabber.org>--ooO--(_)--Ooo--

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] mm: cma: introduce /proc/cmainfo
  2014-12-26 16:02   ` Michal Nazarewicz
@ 2014-12-29 14:09     ` Stefan Strogin
  2014-12-29 17:26       ` Michal Nazarewicz
  2014-12-31  1:14       ` Gioh Kim
  0 siblings, 2 replies; 39+ messages in thread
From: Stefan Strogin @ 2014-12-29 14:09 UTC (permalink / raw)
  To: Michal Nazarewicz, Stefan I. Strogin, linux-mm, linux-kernel
  Cc: Joonsoo Kim, Andrew Morton, Marek Szyprowski, aneesh.kumar,
	Laurent Pinchart, Pintu Kumar, Weijie Yang, Laura Abbott, Hui Zhu,
	Minchan Kim, Dyasly Sergey, Vyacheslav Tyrtov, Stefan Strogin

Thanks for review MichaA?,

On 12/26/2014 07:02 PM, Michal Nazarewicz wrote:
> On Fri, Dec 26 2014, "Stefan I. Strogin" <s.strogin@partner.samsung.com> wrote:
>> /proc/cmainfo contains a list of currently allocated CMA buffers for every
>> CMA area when CONFIG_CMA_DEBUG is enabled.
>>
>> Format is:
>>
>> <base_phys_addr> - <end_phys_addr> (<size> kB), allocated by <PID>\
>> 		(<command name>), latency <allocation latency> us
>>   <stack backtrace when the buffer had been allocated>
>>
>> Signed-off-by: Stefan I. Strogin <s.strogin@partner.samsung.com>
>> ---
>>   mm/cma.c | 202 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 202 insertions(+)
>>
>> diff --git a/mm/cma.c b/mm/cma.c
>> index a85ae28..ffaea26 100644
>> --- a/mm/cma.c
>> +++ b/mm/cma.c
>> @@ -347,6 +372,86 @@ err:
>>   	return ret;
>>   }
>>
>> +#ifdef CONFIG_CMA_DEBUG
>> +/**
>> + * cma_buffer_list_add() - add a new entry to a list of allocated buffers
>> + * @cma:     Contiguous memory region for which the allocation is performed.
>> + * @pfn:     Base PFN of the allocated buffer.
>> + * @count:   Number of allocated pages.
>> + * @latency: Nanoseconds spent to allocate the buffer.
>> + *
>> + * This function adds a new entry to the list of allocated contiguous memory
>> + * buffers in a CMA area. It uses the CMA area specificated by the device
>> + * if available or the default global one otherwise.
>> + */
>> +static int cma_buffer_list_add(struct cma *cma, unsigned long pfn,
>> +			       int count, s64 latency)
>> +{
>> +	struct cma_buffer *cmabuf;
>> +	struct stack_trace trace;
>> +
>> +	cmabuf = kmalloc(sizeof(struct cma_buffer), GFP_KERNEL);
>
> 	cmabuf = kmalloc(sizeof *cmabuf, GFP_KERNEL);

	cmabuf = kmalloc(sizeof(*cmabuf), GFP_KERNEL);

>
>> +	if (!cmabuf)
>> +		return -ENOMEM;
>> +
>> +	trace.nr_entries = 0;
>> +	trace.max_entries = ARRAY_SIZE(cmabuf->trace_entries);
>> +	trace.entries = &cmabuf->trace_entries[0];
>> +	trace.skip = 2;
>> +	save_stack_trace(&trace);
>> +
>> +	cmabuf->pfn = pfn;
>> +	cmabuf->count = count;
>> +	cmabuf->pid = task_pid_nr(current);
>> +	cmabuf->nr_entries = trace.nr_entries;
>> +	get_task_comm(cmabuf->comm, current);
>> +	cmabuf->latency = (unsigned int) div_s64(latency, NSEC_PER_USEC);
>> +
>> +	mutex_lock(&cma->list_lock);
>> +	list_add_tail(&cmabuf->list, &cma->buffers_list);
>> +	mutex_unlock(&cma->list_lock);
>> +
>> +	return 0;
>> +}
>> +
>> +/**
>> + * cma_buffer_list_del() - delete an entry from a list of allocated buffers
>> + * @cma:   Contiguous memory region for which the allocation was performed.
>> + * @pfn:   Base PFN of the released buffer.
>> + *
>> + * This function deletes a list entry added by cma_buffer_list_add().
>> + */
>> +static void cma_buffer_list_del(struct cma *cma, unsigned long pfn)
>> +{
>> +	struct cma_buffer *cmabuf;
>> +
>> +	mutex_lock(&cma->list_lock);
>> +
>> +	list_for_each_entry(cmabuf, &cma->buffers_list, list)
>> +		if (cmabuf->pfn == pfn) {
>> +			list_del(&cmabuf->list);
>> +			kfree(cmabuf);
>> +			goto out;
>> +		}
>
> You do not have guarantee that CMA deallocations will match allocations
> exactly.  User may allocate CMA region and then free it chunks.  I'm not
> saying that the debug code must handle than case but at least I would
> like to see a comment describing this shortcoming.

Thanks, I'll fix it. If a number of released pages is less than there
were allocated then the list entry shouldn't be deleted, but it's fields
should be updated.

>
>> @@ -361,11 +466,15 @@ struct page *cma_alloc(struct cma *cma, int count, unsigned int align)
>>   	unsigned long mask, offset, pfn, start = 0;
>>   	unsigned long bitmap_maxno, bitmap_no, bitmap_count;
>>   	struct page *page = NULL;
>> +	struct timespec ts1, ts2;
>> +	s64 latency;
>>   	int ret;
>>
>>   	if (!cma || !cma->count)
>>   		return NULL;
>>
>> +	getnstimeofday(&ts1);
>> +
>
> If CMA_DEBUG is disabled, you waste time on measuring latency.  Either
> use #ifdef or IS_ENABLED, e.g.:
>
> 	if (IS_ENABLED(CMA_DEBUG))
> 		getnstimeofday(&ts1);

Obviously! :)

>
>> @@ -413,6 +522,19 @@ struct page *cma_alloc(struct cma *cma, int count, unsigned int align)
>>   		start = bitmap_no + mask + 1;
>>   	}
>>
>> +	getnstimeofday(&ts2);
>> +	latency = timespec_to_ns(&ts2) - timespec_to_ns(&ts1);
>> +
>> +	if (page) {
>
> 	if (IS_ENABLED(CMA_DEBUG) && page) {
> 		getnstimeofday(&ts2);
> 		latency = timespec_to_ns(&ts2) - timespec_to_ns(&ts1);
>
>> +		ret = cma_buffer_list_add(cma, pfn, count, latency);
>
> You could also change cma_buffer_list_add to take ts1 as an argument
> instead of latency and then latency calculating would be hidden inside
> of that function.  Initialising ts1 should still be guarded with
> IS_ENABLED of course.

	if (IS_ENABLED(CMA_DEBUG) && page) {
		getnstimeofday(&ts2);
		latency = timespec_to_ns(&ts2) - timespec_to_ns(&ts1);

It seem to me this variant is better readable, thanks.

>
>> +		if (ret) {
>> +			pr_warn("%s(): cma_buffer_list_add() returned %d\n",
>> +				__func__, ret);
>> +			cma_release(cma, page, count);
>> +			page = NULL;
>
> Harsh, but ok, if you want.

Excuse me, maybe you could suggest how to make a nicer fallback?
Or sure OK?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] mm: cma: introduce /proc/cmainfo
  2014-12-29 14:09     ` Stefan Strogin
@ 2014-12-29 17:26       ` Michal Nazarewicz
  2014-12-31  1:14       ` Gioh Kim
  1 sibling, 0 replies; 39+ messages in thread
From: Michal Nazarewicz @ 2014-12-29 17:26 UTC (permalink / raw)
  To: Stefan Strogin, Stefan I. Strogin, linux-mm, linux-kernel
  Cc: Joonsoo Kim, Andrew Morton, Marek Szyprowski, aneesh.kumar,
	Laurent Pinchart, Pintu Kumar, Weijie Yang, Laura Abbott, Hui Zhu,
	Minchan Kim, Dyasly Sergey, Vyacheslav Tyrtov

>> On Fri, Dec 26 2014, "Stefan I. Strogin" <s.strogin@partner.samsung.com> wrote:
>>> +		if (ret) {
>>> +			pr_warn("%s(): cma_buffer_list_add() returned %d\n",
>>> +				__func__, ret);
>>> +			cma_release(cma, page, count);
>>> +			page = NULL;

> On 12/26/2014 07:02 PM, Michal Nazarewicz wrote:
>> Harsh, but ok, if you want.

On Mon, Dec 29 2014, Stefan Strogin wrote:
> Excuse me, maybe you could suggest how to make a nicer fallback?
> Or sure OK?

I would leave the allocation succeed and print warning that the debug
information is invalid.  You could have a “dirty” flag which is set if
that happens (or on a partial release discussed earlier) which, if set,
would add “Some debug information missing” message at the beginning of
the procfs file.  In my opinion CMA succeeding is more important than
having correct debug information.

-- 
Best regards,                                         _     _
.o. | Liege of Serenely Enlightened Majesty of      o' \,=./ `o
..o | Computer Science,  Michał “mina86” Nazarewicz    (o o)
ooo +--<mpn@google.com>--<xmpp:mina86@jabber.org>--ooO--(_)--Ooo--

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] mm: cma: introduce /proc/cmainfo
  2014-12-29 14:09     ` Stefan Strogin
  2014-12-29 17:26       ` Michal Nazarewicz
@ 2014-12-31  1:14       ` Gioh Kim
  2015-01-23 12:32         ` Stefan Strogin
  1 sibling, 1 reply; 39+ messages in thread
From: Gioh Kim @ 2014-12-31  1:14 UTC (permalink / raw)
  To: Stefan Strogin, Michal Nazarewicz, Stefan I. Strogin, linux-mm,
	linux-kernel
  Cc: Joonsoo Kim, Andrew Morton, Marek Szyprowski, aneesh.kumar,
	Laurent Pinchart, Pintu Kumar, Weijie Yang, Laura Abbott, Hui Zhu,
	Minchan Kim, Dyasly Sergey, Vyacheslav Tyrtov



2014-12-29 i??i?? 11:09i?? Stefan Strogin i?'(e??) i?' e,?:
> Thanks for review MichaA?,
>
> On 12/26/2014 07:02 PM, Michal Nazarewicz wrote:
>> On Fri, Dec 26 2014, "Stefan I. Strogin" <s.strogin@partner.samsung.com> wrote:
>>> /proc/cmainfo contains a list of currently allocated CMA buffers for every
>>> CMA area when CONFIG_CMA_DEBUG is enabled.
>>>
>>> Format is:
>>>
>>> <base_phys_addr> - <end_phys_addr> (<size> kB), allocated by <PID>\
>>>         (<command name>), latency <allocation latency> us
>>>   <stack backtrace when the buffer had been allocated>
>>>
>>> Signed-off-by: Stefan I. Strogin <s.strogin@partner.samsung.com>
>>> ---
>>>   mm/cma.c | 202 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>   1 file changed, 202 insertions(+)
>>>
>>> diff --git a/mm/cma.c b/mm/cma.c
>>> index a85ae28..ffaea26 100644
>>> --- a/mm/cma.c
>>> +++ b/mm/cma.c
>>> @@ -347,6 +372,86 @@ err:
>>>       return ret;
>>>   }
>>>
>>> +#ifdef CONFIG_CMA_DEBUG
>>> +/**
>>> + * cma_buffer_list_add() - add a new entry to a list of allocated buffers
>>> + * @cma:     Contiguous memory region for which the allocation is performed.
>>> + * @pfn:     Base PFN of the allocated buffer.
>>> + * @count:   Number of allocated pages.
>>> + * @latency: Nanoseconds spent to allocate the buffer.
>>> + *
>>> + * This function adds a new entry to the list of allocated contiguous memory
>>> + * buffers in a CMA area. It uses the CMA area specificated by the device
>>> + * if available or the default global one otherwise.
>>> + */
>>> +static int cma_buffer_list_add(struct cma *cma, unsigned long pfn,
>>> +                   int count, s64 latency)
>>> +{
>>> +    struct cma_buffer *cmabuf;
>>> +    struct stack_trace trace;
>>> +
>>> +    cmabuf = kmalloc(sizeof(struct cma_buffer), GFP_KERNEL);
>>
>>     cmabuf = kmalloc(sizeof *cmabuf, GFP_KERNEL);
>
>      cmabuf = kmalloc(sizeof(*cmabuf), GFP_KERNEL);
>
>>
>>> +    if (!cmabuf)
>>> +        return -ENOMEM;
>>> +
>>> +    trace.nr_entries = 0;
>>> +    trace.max_entries = ARRAY_SIZE(cmabuf->trace_entries);
>>> +    trace.entries = &cmabuf->trace_entries[0];
>>> +    trace.skip = 2;
>>> +    save_stack_trace(&trace);
>>> +
>>> +    cmabuf->pfn = pfn;
>>> +    cmabuf->count = count;
>>> +    cmabuf->pid = task_pid_nr(current);
>>> +    cmabuf->nr_entries = trace.nr_entries;
>>> +    get_task_comm(cmabuf->comm, current);
>>> +    cmabuf->latency = (unsigned int) div_s64(latency, NSEC_PER_USEC);
>>> +
>>> +    mutex_lock(&cma->list_lock);
>>> +    list_add_tail(&cmabuf->list, &cma->buffers_list);
>>> +    mutex_unlock(&cma->list_lock);
>>> +
>>> +    return 0;
>>> +}

Is it ok if the information is too big?
I'm not sure but I remember that seq_printf has 4K limitation.

So I made seq_operations with seq_list_start/next functions.

EX)

static void *debug_seq_start(struct seq_file *s, loff_t *pos)
{
A>>       mutex_lock(&debug_lock);
A>>       return seq_list_start(&debug_list, *pos);
}	

static void debug_seq_stop(struct seq_file *s, void *data)
{
A>>       struct debug_header *header = data;

A>>       if (header == NULL || &header->head_list == &debug_list) {
A>>       A>>       seq_printf(s, "end of info");
A>>       }

A>>       mutex_unlock(&debug_lock);
}

static void *debug_seq_next(struct seq_file *s, void *data, loff_t *pos)
{
A>>       return seq_list_next(data, &debug_list, pos);
}

static int debug_seq_show(struct seq_file *sfile, void *data)
{
A>>       struct debug_header *header;
A>>       char *p;

A>>       header= list_entry(data,
A>>       A>>       A>>          struct debug_header,	
A>>       A>>       A>>          head_list);

A>>       seq_printf(sfile, "print info");
A>>       return 0;
}
static const struct seq_operations debug_seq_ops = {
A>>       .start = debug_seq_start,	
A>>       .next = debug_seq_next,	
A>>       .stop = debug_seq_stop,	
A>>       .show = debug_seq_show,	
};

>> You do not have guarantee that CMA deallocations will match allocations
>> exactly.  User may allocate CMA region and then free it chunks.  I'm not
>> saying that the debug code must handle than case but at least I would
>> like to see a comment describing this shortcoming.
>
> Thanks, I'll fix it. If a number of released pages is less than there
> were allocated then the list entry shouldn't be deleted, but it's fields
> should be updated.
>
>>
>>> @@ -361,11 +466,15 @@ struct page *cma_alloc(struct cma *cma, int count, unsigned int align)
>>>       unsigned long mask, offset, pfn, start = 0;
>>>       unsigned long bitmap_maxno, bitmap_no, bitmap_count;
>>>       struct page *page = NULL;
>>> +    struct timespec ts1, ts2;
>>> +    s64 latency;
>>>       int ret;
>>>
>>>       if (!cma || !cma->count)
>>>           return NULL;
>>>
>>> +    getnstimeofday(&ts1);
>>> +
>>
>> If CMA_DEBUG is disabled, you waste time on measuring latency.  Either
>> use #ifdef or IS_ENABLED, e.g.:
>>
>>     if (IS_ENABLED(CMA_DEBUG))
>>         getnstimeofday(&ts1);
>
> Obviously! :)
>
>>
>>> @@ -413,6 +522,19 @@ struct page *cma_alloc(struct cma *cma, int count, unsigned int align)
>>>           start = bitmap_no + mask + 1;
>>>       }
>>>
>>> +    getnstimeofday(&ts2);
>>> +    latency = timespec_to_ns(&ts2) - timespec_to_ns(&ts1);
>>> +
>>> +    if (page) {
>>
>>     if (IS_ENABLED(CMA_DEBUG) && page) {
>>         getnstimeofday(&ts2);
>>         latency = timespec_to_ns(&ts2) - timespec_to_ns(&ts1);
>>
>>> +        ret = cma_buffer_list_add(cma, pfn, count, latency);
>>
>> You could also change cma_buffer_list_add to take ts1 as an argument
>> instead of latency and then latency calculating would be hidden inside
>> of that function.  Initialising ts1 should still be guarded with
>> IS_ENABLED of course.
>
>      if (IS_ENABLED(CMA_DEBUG) && page) {
>          getnstimeofday(&ts2);
>          latency = timespec_to_ns(&ts2) - timespec_to_ns(&ts1);
>
> It seem to me this variant is better readable, thanks.
>
>>
>>> +        if (ret) {
>>> +            pr_warn("%s(): cma_buffer_list_add() returned %d\n",
>>> +                __func__, ret);
>>> +            cma_release(cma, page, count);
>>> +            page = NULL;
>>
>> Harsh, but ok, if you want.
>
> Excuse me, maybe you could suggest how to make a nicer fallback?
> Or sure OK?
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] mm: cma: introduce /proc/cmainfo
  2014-12-31  1:14       ` Gioh Kim
@ 2015-01-23 12:32         ` Stefan Strogin
  0 siblings, 0 replies; 39+ messages in thread
From: Stefan Strogin @ 2015-01-23 12:32 UTC (permalink / raw)
  To: Gioh Kim, linux-mm, linux-kernel
  Cc: Stefan Strogin, Michal Nazarewicz, Joonsoo Kim, Andrew Morton,
	Marek Szyprowski, aneesh.kumar, Laurent Pinchart, Pintu Kumar,
	Weijie Yang, Laura Abbott, Hui Zhu, Minchan Kim, Dyasly Sergey,
	Vyacheslav Tyrtov, s.strogin

Hello Gioh,

On 31/12/14 04:14, Gioh Kim wrote:
>
> Is it ok if the information is too big?
> I'm not sure but I remember that seq_printf has 4K limitation.

Thanks for reviewing, excuse me for a long delay.

If I understand correctly it is OK, since it's written in comments for
seq_has_overflowed():
>  * seq_files have a buffer which may overflow. When this happens a larger
>  * buffer is reallocated and all the data will be printed again.
>  * The overflow state is true when m->count == m->size.
And exactly this happens in traverse().

But I think that it's not important anymore as I intent not to use
seq_files in the second version.


>
> So I made seq_operations with seq_list_start/next functions.
>
> EX)
>
> static void *debug_seq_start(struct seq_file *s, loff_t *pos)
> {
> A>>       mutex_lock(&debug_lock);
> A>>       return seq_list_start(&debug_list, *pos);
> }   
>
> static void debug_seq_stop(struct seq_file *s, void *data)
> {
> A>>       struct debug_header *header = data;
>
> A>>       if (header == NULL || &header->head_list == &debug_list) {
> A>>       A>>       seq_printf(s, "end of info");
> A>>       }
>
> A>>       mutex_unlock(&debug_lock);
> }
>
> static void *debug_seq_next(struct seq_file *s, void *data, loff_t *pos)
> {
> A>>       return seq_list_next(data, &debug_list, pos);
> }
>
> static int debug_seq_show(struct seq_file *sfile, void *data)
> {
> A>>       struct debug_header *header;
> A>>       char *p;
>
> A>>       header= list_entry(data,
> A>>       A>>       A>>          struct debug_header,   
> A>>       A>>       A>>          head_list);
>
> A>>       seq_printf(sfile, "print info");
> A>>       return 0;
> }
> static const struct seq_operations debug_seq_ops = {
> A>>       .start = debug_seq_start,   
> A>>       .next = debug_seq_next,   
> A>>       .stop = debug_seq_stop,   
> A>>       .show = debug_seq_show,   
> };

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] mm: cma: introduce /proc/cmainfo
  2014-12-26 14:39 ` [PATCH 2/3] mm: cma: introduce /proc/cmainfo Stefan I. Strogin
  2014-12-26 16:02   ` Michal Nazarewicz
@ 2014-12-29 21:11   ` Laura Abbott
  2015-01-21 14:18     ` Stefan Strogin
  2014-12-30  4:38   ` Joonsoo Kim
  2 siblings, 1 reply; 39+ messages in thread
From: Laura Abbott @ 2014-12-29 21:11 UTC (permalink / raw)
  To: Stefan I. Strogin, linux-mm, linux-kernel
  Cc: Joonsoo Kim, Andrew Morton, Marek Szyprowski, Michal Nazarewicz,
	aneesh.kumar, Laurent Pinchart, Dmitry Safonov, Pintu Kumar,
	Weijie Yang, SeongJae Park, Hui Zhu, Minchan Kim, Dyasly Sergey,
	Vyacheslav Tyrtov

On 12/26/2014 6:39 AM, Stefan I. Strogin wrote:
> /proc/cmainfo contains a list of currently allocated CMA buffers for every
> CMA area when CONFIG_CMA_DEBUG is enabled.
>
> Format is:
>
> <base_phys_addr> - <end_phys_addr> (<size> kB), allocated by <PID>\
> 		(<command name>), latency <allocation latency> us
>   <stack backtrace when the buffer had been allocated>
>
> Signed-off-by: Stefan I. Strogin <s.strogin@partner.samsung.com>
> ---
...
> +static int __init proc_cmainfo_init(void)
> +{
> +	proc_create("cmainfo", S_IRUSR, NULL, &proc_cmainfo_operations);
> +	return 0;
> +}
> +
> +module_init(proc_cmainfo_init);
> +#endif /* CONFIG_CMA_DEBUG */
>

This seems better suited to debugfs over procfs, especially since the
option can be turned off. It would be helpful to break it
down by cma region as well to make it easier on systems with a lot
of regions.

Thanks,
Laura

-- 
Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] mm: cma: introduce /proc/cmainfo
  2014-12-29 21:11   ` Laura Abbott
@ 2015-01-21 14:18     ` Stefan Strogin
  0 siblings, 0 replies; 39+ messages in thread
From: Stefan Strogin @ 2015-01-21 14:18 UTC (permalink / raw)
  To: Laura Abbott, linux-mm, linux-kernel
  Cc: Joonsoo Kim, Andrew Morton, Marek Szyprowski, Michal Nazarewicz,
	aneesh.kumar, Laurent Pinchart, Dmitry Safonov, Pintu Kumar,
	Weijie Yang, SeongJae Park, Hui Zhu, Minchan Kim, Dyasly Sergey,
	Vyacheslav Tyrtov, s.strogin, stefan.strogin

Hello Laura,

On 30/12/14 00:11, Laura Abbott wrote:
>
> This seems better suited to debugfs over procfs, especially since the
> option can be turned off. It would be helpful to break it
> down by cma region as well to make it easier on systems with a lot
> of regions.
>
> Thanks,
> Laura
>

I thought that cmainfo is very similar to vmallocinfo, therefore put it
to procfs. However it seems I have no other choice than debugfs as Pavel
Machek wrote :-)
> We should not add new non-process related files in /proc.
(https://lkml.org/lkml/2015/1/2/6)

And thanks, I agree that breaking it down by CMA region would be useful.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] mm: cma: introduce /proc/cmainfo
  2014-12-26 14:39 ` [PATCH 2/3] mm: cma: introduce /proc/cmainfo Stefan I. Strogin
  2014-12-26 16:02   ` Michal Nazarewicz
  2014-12-29 21:11   ` Laura Abbott
@ 2014-12-30  4:38   ` Joonsoo Kim
  2015-01-22 15:35     ` Stefan Strogin
  2 siblings, 1 reply; 39+ messages in thread
From: Joonsoo Kim @ 2014-12-30  4:38 UTC (permalink / raw)
  To: Stefan I. Strogin
  Cc: linux-mm, linux-kernel, Andrew Morton, Marek Szyprowski,
	Michal Nazarewicz, aneesh.kumar, Laurent Pinchart, Dmitry Safonov,
	Pintu Kumar, Weijie Yang, Laura Abbott, SeongJae Park, Hui Zhu,
	Minchan Kim, Dyasly Sergey, Vyacheslav Tyrtov

On Fri, Dec 26, 2014 at 05:39:03PM +0300, Stefan I. Strogin wrote:
> /proc/cmainfo contains a list of currently allocated CMA buffers for every
> CMA area when CONFIG_CMA_DEBUG is enabled.

Hello,

I think that providing these information looks useful, but, we need better
implementation. As Laura said, it is better to use debugfs. And,
instead of re-implementing the wheel, how about using tracepoint
to print these information? See below comments.

> 
> Format is:
> 
> <base_phys_addr> - <end_phys_addr> (<size> kB), allocated by <PID>\
> 		(<command name>), latency <allocation latency> us
>  <stack backtrace when the buffer had been allocated>
> 
> Signed-off-by: Stefan I. Strogin <s.strogin@partner.samsung.com>
> ---
>  mm/cma.c | 202 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 202 insertions(+)
> 
> diff --git a/mm/cma.c b/mm/cma.c
> index a85ae28..ffaea26 100644
> --- a/mm/cma.c
> +++ b/mm/cma.c
> @@ -34,6 +34,10 @@
>  #include <linux/cma.h>
>  #include <linux/highmem.h>
>  #include <linux/io.h>
> +#include <linux/list.h>
> +#include <linux/proc_fs.h>
> +#include <linux/uaccess.h>
> +#include <linux/time.h>
>  
>  struct cma {
>  	unsigned long	base_pfn;
> @@ -41,8 +45,25 @@ struct cma {
>  	unsigned long	*bitmap;
>  	unsigned int order_per_bit; /* Order of pages represented by one bit */
>  	struct mutex	lock;
> +#ifdef CONFIG_CMA_DEBUG
> +	struct list_head buffers_list;
> +	struct mutex	list_lock;
> +#endif
>  };
>  
> +#ifdef CONFIG_CMA_DEBUG
> +struct cma_buffer {
> +	unsigned long pfn;
> +	unsigned long count;
> +	pid_t pid;
> +	char comm[TASK_COMM_LEN];
> +	unsigned int latency;
> +	unsigned long trace_entries[16];
> +	unsigned int nr_entries;
> +	struct list_head list;
> +};
> +#endif
> +
>  static struct cma cma_areas[MAX_CMA_AREAS];
>  static unsigned cma_area_count;
>  static DEFINE_MUTEX(cma_mutex);
> @@ -132,6 +153,10 @@ static int __init cma_activate_area(struct cma *cma)
>  	} while (--i);
>  
>  	mutex_init(&cma->lock);
> +#ifdef CONFIG_CMA_DEBUG
> +	INIT_LIST_HEAD(&cma->buffers_list);
> +	mutex_init(&cma->list_lock);
> +#endif
>  	return 0;
>  
>  err:
> @@ -347,6 +372,86 @@ err:
>  	return ret;
>  }
>  
> +#ifdef CONFIG_CMA_DEBUG
> +/**
> + * cma_buffer_list_add() - add a new entry to a list of allocated buffers
> + * @cma:     Contiguous memory region for which the allocation is performed.
> + * @pfn:     Base PFN of the allocated buffer.
> + * @count:   Number of allocated pages.
> + * @latency: Nanoseconds spent to allocate the buffer.
> + *
> + * This function adds a new entry to the list of allocated contiguous memory
> + * buffers in a CMA area. It uses the CMA area specificated by the device
> + * if available or the default global one otherwise.
> + */
> +static int cma_buffer_list_add(struct cma *cma, unsigned long pfn,
> +			       int count, s64 latency)
> +{
> +	struct cma_buffer *cmabuf;
> +	struct stack_trace trace;
> +
> +	cmabuf = kmalloc(sizeof(struct cma_buffer), GFP_KERNEL);
> +	if (!cmabuf)
> +		return -ENOMEM;
> +
> +	trace.nr_entries = 0;
> +	trace.max_entries = ARRAY_SIZE(cmabuf->trace_entries);
> +	trace.entries = &cmabuf->trace_entries[0];
> +	trace.skip = 2;
> +	save_stack_trace(&trace);
> +
> +	cmabuf->pfn = pfn;
> +	cmabuf->count = count;
> +	cmabuf->pid = task_pid_nr(current);
> +	cmabuf->nr_entries = trace.nr_entries;
> +	get_task_comm(cmabuf->comm, current);
> +	cmabuf->latency = (unsigned int) div_s64(latency, NSEC_PER_USEC);
> +
> +	mutex_lock(&cma->list_lock);
> +	list_add_tail(&cmabuf->list, &cma->buffers_list);
> +	mutex_unlock(&cma->list_lock);
> +
> +	return 0;
> +}
> +
> +/**
> + * cma_buffer_list_del() - delete an entry from a list of allocated buffers
> + * @cma:   Contiguous memory region for which the allocation was performed.
> + * @pfn:   Base PFN of the released buffer.
> + *
> + * This function deletes a list entry added by cma_buffer_list_add().
> + */
> +static void cma_buffer_list_del(struct cma *cma, unsigned long pfn)
> +{
> +	struct cma_buffer *cmabuf;
> +
> +	mutex_lock(&cma->list_lock);
> +
> +	list_for_each_entry(cmabuf, &cma->buffers_list, list)
> +		if (cmabuf->pfn == pfn) {
> +			list_del(&cmabuf->list);
> +			kfree(cmabuf);
> +			goto out;
> +		}
> +

Is there more elegant way to find buffer? This linear search overhead
would change system behaviour if there are lots of buffers.

> +	pr_err("%s(pfn %lu): couldn't find buffers list entry\n",
> +	       __func__, pfn);
> +
> +out:
> +	mutex_unlock(&cma->list_lock);
> +}
> +#else
> +static int cma_buffer_list_add(struct cma *cma, unsigned long pfn,
> +			       int count, s64 latency)
> +{
> +	return 0;
> +}
> +
> +static void cma_buffer_list_del(struct cma *cma, unsigned long pfn)
> +{
> +}
> +#endif /* CONFIG_CMA_DEBUG */
> +
>  /**
>   * cma_alloc() - allocate pages from contiguous area
>   * @cma:   Contiguous memory region for which the allocation is performed.
> @@ -361,11 +466,15 @@ struct page *cma_alloc(struct cma *cma, int count, unsigned int align)
>  	unsigned long mask, offset, pfn, start = 0;
>  	unsigned long bitmap_maxno, bitmap_no, bitmap_count;
>  	struct page *page = NULL;
> +	struct timespec ts1, ts2;
> +	s64 latency;
>  	int ret;
>  
>  	if (!cma || !cma->count)
>  		return NULL;
>  
> +	getnstimeofday(&ts1);
> +
>  	pr_debug("%s(cma %p, count %d, align %d)\n", __func__, (void *)cma,
>  		 count, align);
>  
> @@ -413,6 +522,19 @@ struct page *cma_alloc(struct cma *cma, int count, unsigned int align)
>  		start = bitmap_no + mask + 1;
>  	}
>  
> +	getnstimeofday(&ts2);
> +	latency = timespec_to_ns(&ts2) - timespec_to_ns(&ts1);
> +
> +	if (page) {
> +		ret = cma_buffer_list_add(cma, pfn, count, latency);
> +		if (ret) {
> +			pr_warn("%s(): cma_buffer_list_add() returned %d\n",
> +				__func__, ret);
> +			cma_release(cma, page, count);
> +			page = NULL;
> +		}

So, we would fail to allocate CMA memory if we can't allocate buffer
for debugging. I don't think it makes sense. With tracepoint,
we don't need to allocate buffer in runtime.

Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] mm: cma: introduce /proc/cmainfo
  2014-12-30  4:38   ` Joonsoo Kim
@ 2015-01-22 15:35     ` Stefan Strogin
  2015-01-23  6:35       ` Joonsoo Kim
  0 siblings, 1 reply; 39+ messages in thread
From: Stefan Strogin @ 2015-01-22 15:35 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: linux-mm, linux-kernel, Andrew Morton, Marek Szyprowski,
	Michal Nazarewicz, aneesh.kumar, Laurent Pinchart, Dmitry Safonov,
	Pintu Kumar, Weijie Yang, Laura Abbott, SeongJae Park, Hui Zhu,
	Minchan Kim, Dyasly Sergey, Vyacheslav Tyrtov

Hello Joonsoo,

On 30/12/14 07:38, Joonsoo Kim wrote:
> On Fri, Dec 26, 2014 at 05:39:03PM +0300, Stefan I. Strogin wrote:
>> /proc/cmainfo contains a list of currently allocated CMA buffers for every
>> CMA area when CONFIG_CMA_DEBUG is enabled.
> Hello,
>
> I think that providing these information looks useful, but, we need better
> implementation. As Laura said, it is better to use debugfs. And,
> instead of re-implementing the wheel, how about using tracepoint
> to print these information? See below comments.

Excuse me for a long delay. I've tried to give a detailed answer here:
https://lkml.org/lkml/2015/1/21/362
Do you mean by <<the re-implemented wheel>> seq_print_stack_trace()? If so
then it was thought to show an owner of each allocated buffer. I used a
similar way as in page_owner: saving stack_trace for each allocation. Do
you think we can use tracepoints instead?


>
>> Format is:
>>
>> <base_phys_addr> - <end_phys_addr> (<size> kB), allocated by <PID>\
>> 		(<command name>), latency <allocation latency> us
>>  <stack backtrace when the buffer had been allocated>
>>
>> Signed-off-by: Stefan I. Strogin <s.strogin@partner.samsung.com>
>> ---
>>  mm/cma.c | 202 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 202 insertions(+)
>>
>> diff --git a/mm/cma.c b/mm/cma.c
>> index a85ae28..ffaea26 100644
>> --- a/mm/cma.c
>> +++ b/mm/cma.c
>> @@ -34,6 +34,10 @@
>>  #include <linux/cma.h>
>>  #include <linux/highmem.h>
>>  #include <linux/io.h>
>> +#include <linux/list.h>
>> +#include <linux/proc_fs.h>
>> +#include <linux/uaccess.h>
>> +#include <linux/time.h>
>>  
>>  struct cma {
>>  	unsigned long	base_pfn;
>> @@ -41,8 +45,25 @@ struct cma {
>>  	unsigned long	*bitmap;
>>  	unsigned int order_per_bit; /* Order of pages represented by one bit */
>>  	struct mutex	lock;
>> +#ifdef CONFIG_CMA_DEBUG
>> +	struct list_head buffers_list;
>> +	struct mutex	list_lock;
>> +#endif
>>  };
>>  
>> +#ifdef CONFIG_CMA_DEBUG
>> +struct cma_buffer {
>> +	unsigned long pfn;
>> +	unsigned long count;
>> +	pid_t pid;
>> +	char comm[TASK_COMM_LEN];
>> +	unsigned int latency;
>> +	unsigned long trace_entries[16];
>> +	unsigned int nr_entries;
>> +	struct list_head list;
>> +};
>> +#endif
>> +
>>  static struct cma cma_areas[MAX_CMA_AREAS];
>>  static unsigned cma_area_count;
>>  static DEFINE_MUTEX(cma_mutex);
>> @@ -132,6 +153,10 @@ static int __init cma_activate_area(struct cma *cma)
>>  	} while (--i);
>>  
>>  	mutex_init(&cma->lock);
>> +#ifdef CONFIG_CMA_DEBUG
>> +	INIT_LIST_HEAD(&cma->buffers_list);
>> +	mutex_init(&cma->list_lock);
>> +#endif
>>  	return 0;
>>  
>>  err:
>> @@ -347,6 +372,86 @@ err:
>>  	return ret;
>>  }
>>  
>> +#ifdef CONFIG_CMA_DEBUG
>> +/**
>> + * cma_buffer_list_add() - add a new entry to a list of allocated buffers
>> + * @cma:     Contiguous memory region for which the allocation is performed.
>> + * @pfn:     Base PFN of the allocated buffer.
>> + * @count:   Number of allocated pages.
>> + * @latency: Nanoseconds spent to allocate the buffer.
>> + *
>> + * This function adds a new entry to the list of allocated contiguous memory
>> + * buffers in a CMA area. It uses the CMA area specificated by the device
>> + * if available or the default global one otherwise.
>> + */
>> +static int cma_buffer_list_add(struct cma *cma, unsigned long pfn,
>> +			       int count, s64 latency)
>> +{
>> +	struct cma_buffer *cmabuf;
>> +	struct stack_trace trace;
>> +
>> +	cmabuf = kmalloc(sizeof(struct cma_buffer), GFP_KERNEL);
>> +	if (!cmabuf)
>> +		return -ENOMEM;
>> +
>> +	trace.nr_entries = 0;
>> +	trace.max_entries = ARRAY_SIZE(cmabuf->trace_entries);
>> +	trace.entries = &cmabuf->trace_entries[0];
>> +	trace.skip = 2;
>> +	save_stack_trace(&trace);
>> +
>> +	cmabuf->pfn = pfn;
>> +	cmabuf->count = count;
>> +	cmabuf->pid = task_pid_nr(current);
>> +	cmabuf->nr_entries = trace.nr_entries;
>> +	get_task_comm(cmabuf->comm, current);
>> +	cmabuf->latency = (unsigned int) div_s64(latency, NSEC_PER_USEC);
>> +
>> +	mutex_lock(&cma->list_lock);
>> +	list_add_tail(&cmabuf->list, &cma->buffers_list);
>> +	mutex_unlock(&cma->list_lock);
>> +
>> +	return 0;
>> +}
>> +
>> +/**
>> + * cma_buffer_list_del() - delete an entry from a list of allocated buffers
>> + * @cma:   Contiguous memory region for which the allocation was performed.
>> + * @pfn:   Base PFN of the released buffer.
>> + *
>> + * This function deletes a list entry added by cma_buffer_list_add().
>> + */
>> +static void cma_buffer_list_del(struct cma *cma, unsigned long pfn)
>> +{
>> +	struct cma_buffer *cmabuf;
>> +
>> +	mutex_lock(&cma->list_lock);
>> +
>> +	list_for_each_entry(cmabuf, &cma->buffers_list, list)
>> +		if (cmabuf->pfn == pfn) {
>> +			list_del(&cmabuf->list);
>> +			kfree(cmabuf);
>> +			goto out;
>> +		}
>> +
> Is there more elegant way to find buffer? This linear search overhead
> would change system behaviour if there are lots of buffers.
>
>> +	pr_err("%s(pfn %lu): couldn't find buffers list entry\n",
>> +	       __func__, pfn);
>> +
>> +out:
>> +	mutex_unlock(&cma->list_lock);
>> +}
>> +#else
>> +static int cma_buffer_list_add(struct cma *cma, unsigned long pfn,
>> +			       int count, s64 latency)
>> +{
>> +	return 0;
>> +}
>> +
>> +static void cma_buffer_list_del(struct cma *cma, unsigned long pfn)
>> +{
>> +}
>> +#endif /* CONFIG_CMA_DEBUG */
>> +
>>  /**
>>   * cma_alloc() - allocate pages from contiguous area
>>   * @cma:   Contiguous memory region for which the allocation is performed.
>> @@ -361,11 +466,15 @@ struct page *cma_alloc(struct cma *cma, int count, unsigned int align)
>>  	unsigned long mask, offset, pfn, start = 0;
>>  	unsigned long bitmap_maxno, bitmap_no, bitmap_count;
>>  	struct page *page = NULL;
>> +	struct timespec ts1, ts2;
>> +	s64 latency;
>>  	int ret;
>>  
>>  	if (!cma || !cma->count)
>>  		return NULL;
>>  
>> +	getnstimeofday(&ts1);
>> +
>>  	pr_debug("%s(cma %p, count %d, align %d)\n", __func__, (void *)cma,
>>  		 count, align);
>>  
>> @@ -413,6 +522,19 @@ struct page *cma_alloc(struct cma *cma, int count, unsigned int align)
>>  		start = bitmap_no + mask + 1;
>>  	}
>>  
>> +	getnstimeofday(&ts2);
>> +	latency = timespec_to_ns(&ts2) - timespec_to_ns(&ts1);
>> +
>> +	if (page) {
>> +		ret = cma_buffer_list_add(cma, pfn, count, latency);
>> +		if (ret) {
>> +			pr_warn("%s(): cma_buffer_list_add() returned %d\n",
>> +				__func__, ret);
>> +			cma_release(cma, page, count);
>> +			page = NULL;
>> +		}
> So, we would fail to allocate CMA memory if we can't allocate buffer
> for debugging. I don't think it makes sense. With tracepoint,
> we don't need to allocate buffer in runtime.
>
> Thanks.
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/3] mm: cma: introduce /proc/cmainfo
  2015-01-22 15:35     ` Stefan Strogin
@ 2015-01-23  6:35       ` Joonsoo Kim
  0 siblings, 0 replies; 39+ messages in thread
From: Joonsoo Kim @ 2015-01-23  6:35 UTC (permalink / raw)
  To: Stefan Strogin
  Cc: linux-mm, linux-kernel, Andrew Morton, Marek Szyprowski,
	Michal Nazarewicz, aneesh.kumar, Laurent Pinchart, Dmitry Safonov,
	Pintu Kumar, Weijie Yang, Laura Abbott, SeongJae Park, Hui Zhu,
	Minchan Kim, Dyasly Sergey, Vyacheslav Tyrtov

On Thu, Jan 22, 2015 at 06:35:53PM +0300, Stefan Strogin wrote:
> Hello Joonsoo,
> 
> On 30/12/14 07:38, Joonsoo Kim wrote:
> > On Fri, Dec 26, 2014 at 05:39:03PM +0300, Stefan I. Strogin wrote:
> >> /proc/cmainfo contains a list of currently allocated CMA buffers for every
> >> CMA area when CONFIG_CMA_DEBUG is enabled.
> > Hello,
> >
> > I think that providing these information looks useful, but, we need better
> > implementation. As Laura said, it is better to use debugfs. And,
> > instead of re-implementing the wheel, how about using tracepoint
> > to print these information? See below comments.
> 
> Excuse me for a long delay. I've tried to give a detailed answer here:
> https://lkml.org/lkml/2015/1/21/362
> Do you mean by <<the re-implemented wheel>> seq_print_stack_trace()? If so
> then it was thought to show an owner of each allocated buffer. I used a
> similar way as in page_owner: saving stack_trace for each allocation. Do
> you think we can use tracepoints instead?

I wrote why I said this is re-implemented wheel on the reply of other mail.
Please refer it.

Thanks.

> 
> 
> >
> >> Format is:
> >>
> >> <base_phys_addr> - <end_phys_addr> (<size> kB), allocated by <PID>\
> >> 		(<command name>), latency <allocation latency> us
> >>  <stack backtrace when the buffer had been allocated>
> >>
> >> Signed-off-by: Stefan I. Strogin <s.strogin@partner.samsung.com>
> >> ---
> >>  mm/cma.c | 202 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>  1 file changed, 202 insertions(+)
> >>
> >> diff --git a/mm/cma.c b/mm/cma.c
> >> index a85ae28..ffaea26 100644
> >> --- a/mm/cma.c
> >> +++ b/mm/cma.c
> >> @@ -34,6 +34,10 @@
> >>  #include <linux/cma.h>
> >>  #include <linux/highmem.h>
> >>  #include <linux/io.h>
> >> +#include <linux/list.h>
> >> +#include <linux/proc_fs.h>
> >> +#include <linux/uaccess.h>
> >> +#include <linux/time.h>
> >>  
> >>  struct cma {
> >>  	unsigned long	base_pfn;
> >> @@ -41,8 +45,25 @@ struct cma {
> >>  	unsigned long	*bitmap;
> >>  	unsigned int order_per_bit; /* Order of pages represented by one bit */
> >>  	struct mutex	lock;
> >> +#ifdef CONFIG_CMA_DEBUG
> >> +	struct list_head buffers_list;
> >> +	struct mutex	list_lock;
> >> +#endif
> >>  };
> >>  
> >> +#ifdef CONFIG_CMA_DEBUG
> >> +struct cma_buffer {
> >> +	unsigned long pfn;
> >> +	unsigned long count;
> >> +	pid_t pid;
> >> +	char comm[TASK_COMM_LEN];
> >> +	unsigned int latency;
> >> +	unsigned long trace_entries[16];
> >> +	unsigned int nr_entries;
> >> +	struct list_head list;
> >> +};
> >> +#endif
> >> +
> >>  static struct cma cma_areas[MAX_CMA_AREAS];
> >>  static unsigned cma_area_count;
> >>  static DEFINE_MUTEX(cma_mutex);
> >> @@ -132,6 +153,10 @@ static int __init cma_activate_area(struct cma *cma)
> >>  	} while (--i);
> >>  
> >>  	mutex_init(&cma->lock);
> >> +#ifdef CONFIG_CMA_DEBUG
> >> +	INIT_LIST_HEAD(&cma->buffers_list);
> >> +	mutex_init(&cma->list_lock);
> >> +#endif
> >>  	return 0;
> >>  
> >>  err:
> >> @@ -347,6 +372,86 @@ err:
> >>  	return ret;
> >>  }
> >>  
> >> +#ifdef CONFIG_CMA_DEBUG
> >> +/**
> >> + * cma_buffer_list_add() - add a new entry to a list of allocated buffers
> >> + * @cma:     Contiguous memory region for which the allocation is performed.
> >> + * @pfn:     Base PFN of the allocated buffer.
> >> + * @count:   Number of allocated pages.
> >> + * @latency: Nanoseconds spent to allocate the buffer.
> >> + *
> >> + * This function adds a new entry to the list of allocated contiguous memory
> >> + * buffers in a CMA area. It uses the CMA area specificated by the device
> >> + * if available or the default global one otherwise.
> >> + */
> >> +static int cma_buffer_list_add(struct cma *cma, unsigned long pfn,
> >> +			       int count, s64 latency)
> >> +{
> >> +	struct cma_buffer *cmabuf;
> >> +	struct stack_trace trace;
> >> +
> >> +	cmabuf = kmalloc(sizeof(struct cma_buffer), GFP_KERNEL);
> >> +	if (!cmabuf)
> >> +		return -ENOMEM;
> >> +
> >> +	trace.nr_entries = 0;
> >> +	trace.max_entries = ARRAY_SIZE(cmabuf->trace_entries);
> >> +	trace.entries = &cmabuf->trace_entries[0];
> >> +	trace.skip = 2;
> >> +	save_stack_trace(&trace);
> >> +
> >> +	cmabuf->pfn = pfn;
> >> +	cmabuf->count = count;
> >> +	cmabuf->pid = task_pid_nr(current);
> >> +	cmabuf->nr_entries = trace.nr_entries;
> >> +	get_task_comm(cmabuf->comm, current);
> >> +	cmabuf->latency = (unsigned int) div_s64(latency, NSEC_PER_USEC);
> >> +
> >> +	mutex_lock(&cma->list_lock);
> >> +	list_add_tail(&cmabuf->list, &cma->buffers_list);
> >> +	mutex_unlock(&cma->list_lock);
> >> +
> >> +	return 0;
> >> +}
> >> +
> >> +/**
> >> + * cma_buffer_list_del() - delete an entry from a list of allocated buffers
> >> + * @cma:   Contiguous memory region for which the allocation was performed.
> >> + * @pfn:   Base PFN of the released buffer.
> >> + *
> >> + * This function deletes a list entry added by cma_buffer_list_add().
> >> + */
> >> +static void cma_buffer_list_del(struct cma *cma, unsigned long pfn)
> >> +{
> >> +	struct cma_buffer *cmabuf;
> >> +
> >> +	mutex_lock(&cma->list_lock);
> >> +
> >> +	list_for_each_entry(cmabuf, &cma->buffers_list, list)
> >> +		if (cmabuf->pfn == pfn) {
> >> +			list_del(&cmabuf->list);
> >> +			kfree(cmabuf);
> >> +			goto out;
> >> +		}
> >> +
> > Is there more elegant way to find buffer? This linear search overhead
> > would change system behaviour if there are lots of buffers.
> >
> >> +	pr_err("%s(pfn %lu): couldn't find buffers list entry\n",
> >> +	       __func__, pfn);
> >> +
> >> +out:
> >> +	mutex_unlock(&cma->list_lock);
> >> +}
> >> +#else
> >> +static int cma_buffer_list_add(struct cma *cma, unsigned long pfn,
> >> +			       int count, s64 latency)
> >> +{
> >> +	return 0;
> >> +}
> >> +
> >> +static void cma_buffer_list_del(struct cma *cma, unsigned long pfn)
> >> +{
> >> +}
> >> +#endif /* CONFIG_CMA_DEBUG */
> >> +
> >>  /**
> >>   * cma_alloc() - allocate pages from contiguous area
> >>   * @cma:   Contiguous memory region for which the allocation is performed.
> >> @@ -361,11 +466,15 @@ struct page *cma_alloc(struct cma *cma, int count, unsigned int align)
> >>  	unsigned long mask, offset, pfn, start = 0;
> >>  	unsigned long bitmap_maxno, bitmap_no, bitmap_count;
> >>  	struct page *page = NULL;
> >> +	struct timespec ts1, ts2;
> >> +	s64 latency;
> >>  	int ret;
> >>  
> >>  	if (!cma || !cma->count)
> >>  		return NULL;
> >>  
> >> +	getnstimeofday(&ts1);
> >> +
> >>  	pr_debug("%s(cma %p, count %d, align %d)\n", __func__, (void *)cma,
> >>  		 count, align);
> >>  
> >> @@ -413,6 +522,19 @@ struct page *cma_alloc(struct cma *cma, int count, unsigned int align)
> >>  		start = bitmap_no + mask + 1;
> >>  	}
> >>  
> >> +	getnstimeofday(&ts2);
> >> +	latency = timespec_to_ns(&ts2) - timespec_to_ns(&ts1);
> >> +
> >> +	if (page) {
> >> +		ret = cma_buffer_list_add(cma, pfn, count, latency);
> >> +		if (ret) {
> >> +			pr_warn("%s(): cma_buffer_list_add() returned %d\n",
> >> +				__func__, ret);
> >> +			cma_release(cma, page, count);
> >> +			page = NULL;
> >> +		}
> > So, we would fail to allocate CMA memory if we can't allocate buffer
> > for debugging. I don't think it makes sense. With tracepoint,
> > we don't need to allocate buffer in runtime.
> >
> > Thanks.
> >
> > --
> > To unsubscribe, send a message with 'unsubscribe linux-mm' in
> > the body to majordomo@kvack.org.  For more info on Linux MM,
> > see: http://www.linux-mm.org/ .
> > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> >
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH 3/3] cma: add functions to get region pages counters
  2014-12-26 14:39 [PATCH 0/3] mm: cma: /proc/cmainfo Stefan I. Strogin
  2014-12-26 14:39 ` [PATCH 1/3] stacktrace: add seq_print_stack_trace() Stefan I. Strogin
  2014-12-26 14:39 ` [PATCH 2/3] mm: cma: introduce /proc/cmainfo Stefan I. Strogin
@ 2014-12-26 14:39 ` Stefan I. Strogin
  2014-12-26 16:10   ` Michal Nazarewicz
                     ` (2 more replies)
  2014-12-29  2:36 ` [PATCH 0/3] mm: cma: /proc/cmainfo Minchan Kim
  3 siblings, 3 replies; 39+ messages in thread
From: Stefan I. Strogin @ 2014-12-26 14:39 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: Dmitry Safonov, s.strogin, Joonsoo Kim, Andrew Morton,
	Marek Szyprowski, Michal Nazarewicz, aneesh.kumar,
	Laurent Pinchart, Pintu Kumar, Weijie Yang, Laura Abbott,
	SeongJae Park, Hui Zhu, Minchan Kim, Dyasly Sergey,
	Vyacheslav Tyrtov

From: Dmitry Safonov <d.safonov@partner.samsung.com>

Here are two functions that provide interface to compute/get used size
and size of biggest free chunk in cma region.
Added that information in cmainfo.

Signed-off-by: Dmitry Safonov <d.safonov@partner.samsung.com>
---
 include/linux/cma.h |  2 ++
 mm/cma.c            | 34 ++++++++++++++++++++++++++++++++++
 2 files changed, 36 insertions(+)

diff --git a/include/linux/cma.h b/include/linux/cma.h
index 9384ba6..855e6f2 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -18,6 +18,8 @@ struct cma;
 extern unsigned long totalcma_pages;
 extern phys_addr_t cma_get_base(struct cma *cma);
 extern unsigned long cma_get_size(struct cma *cma);
+extern unsigned long cma_get_used(struct cma *cma);
+extern unsigned long cma_get_maxchunk(struct cma *cma);
 
 extern int __init cma_declare_contiguous(phys_addr_t base,
 			phys_addr_t size, phys_addr_t limit,
diff --git a/mm/cma.c b/mm/cma.c
index ffaea26..5e560ed 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -78,6 +78,36 @@ unsigned long cma_get_size(struct cma *cma)
 	return cma->count << PAGE_SHIFT;
 }
 
+unsigned long cma_get_used(struct cma *cma)
+{
+	unsigned long ret = 0;
+
+	mutex_lock(&cma->lock);
+	/* pages counter is smaller than sizeof(int) */
+	ret = bitmap_weight(cma->bitmap, (int)cma->count);
+	mutex_unlock(&cma->lock);
+
+	return ret << (PAGE_SHIFT + cma->order_per_bit);
+}
+
+unsigned long cma_get_maxchunk(struct cma *cma)
+{
+	unsigned long maxchunk = 0;
+	unsigned long start, end = 0;
+
+	mutex_lock(&cma->lock);
+	for (;;) {
+		start = find_next_zero_bit(cma->bitmap, cma->count, end);
+		if (start >= cma->count)
+			break;
+		end = find_next_bit(cma->bitmap, cma->count, start);
+		maxchunk = max(end - start, maxchunk);
+	}
+	mutex_unlock(&cma->lock);
+
+	return maxchunk << (PAGE_SHIFT + cma->order_per_bit);
+}
+
 static unsigned long cma_bitmap_aligned_mask(struct cma *cma, int align_order)
 {
 	if (align_order <= cma->order_per_bit)
@@ -591,6 +621,10 @@ static int s_show(struct seq_file *m, void *p)
 	struct cma_buffer *cmabuf;
 	struct stack_trace trace;
 
+	seq_printf(m, "CMARegion stat: %8lu kB total, %8lu kB used, %8lu kB max contiguous chunk\n\n",
+		   cma_get_size(cma) >> 10,
+		   cma_get_used(cma) >> 10,
+		   cma_get_maxchunk(cma) >> 10);
 	mutex_lock(&cma->list_lock);
 
 	list_for_each_entry(cmabuf, &cma->buffers_list, list) {
-- 
2.1.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH 3/3] cma: add functions to get region pages counters
  2014-12-26 14:39 ` [PATCH 3/3] cma: add functions to get region pages counters Stefan I. Strogin
@ 2014-12-26 16:10   ` Michal Nazarewicz
  2014-12-27  7:18   ` SeongJae Park
  2014-12-30  2:26   ` Joonsoo Kim
  2 siblings, 0 replies; 39+ messages in thread
From: Michal Nazarewicz @ 2014-12-26 16:10 UTC (permalink / raw)
  To: Stefan I. Strogin, linux-mm, linux-kernel
  Cc: Dmitry Safonov, Joonsoo Kim, Andrew Morton, Marek Szyprowski,
	aneesh.kumar, Laurent Pinchart, Pintu Kumar, Weijie Yang,
	Laura Abbott, SeongJae Park, Hui Zhu, Minchan Kim, Dyasly Sergey,
	Vyacheslav Tyrtov

On Fri, Dec 26 2014, "Stefan I. Strogin" <s.strogin@partner.samsung.com> wrote:
> From: Dmitry Safonov <d.safonov@partner.samsung.com>
>
> Here are two functions that provide interface to compute/get used size
> and size of biggest free chunk in cma region.
> Added that information in cmainfo.
>
> Signed-off-by: Dmitry Safonov <d.safonov@partner.samsung.com>

Acked-by: Michal Nazarewicz <mina86@mina86.com>

> ---
>  include/linux/cma.h |  2 ++
>  mm/cma.c            | 34 ++++++++++++++++++++++++++++++++++
>  2 files changed, 36 insertions(+)
>
> diff --git a/include/linux/cma.h b/include/linux/cma.h
> index 9384ba6..855e6f2 100644
> --- a/include/linux/cma.h
> +++ b/include/linux/cma.h
> @@ -18,6 +18,8 @@ struct cma;
>  extern unsigned long totalcma_pages;
>  extern phys_addr_t cma_get_base(struct cma *cma);
>  extern unsigned long cma_get_size(struct cma *cma);
> +extern unsigned long cma_get_used(struct cma *cma);
> +extern unsigned long cma_get_maxchunk(struct cma *cma);
>  
>  extern int __init cma_declare_contiguous(phys_addr_t base,
>  			phys_addr_t size, phys_addr_t limit,
> diff --git a/mm/cma.c b/mm/cma.c
> index ffaea26..5e560ed 100644
> --- a/mm/cma.c
> +++ b/mm/cma.c
> @@ -78,6 +78,36 @@ unsigned long cma_get_size(struct cma *cma)
>  	return cma->count << PAGE_SHIFT;
>  }
>  
> +unsigned long cma_get_used(struct cma *cma)
> +{
> +	unsigned long ret = 0;
> +
> +	mutex_lock(&cma->lock);
> +	/* pages counter is smaller than sizeof(int) */
> +	ret = bitmap_weight(cma->bitmap, (int)cma->count);
> +	mutex_unlock(&cma->lock);
> +
> +	return ret << (PAGE_SHIFT + cma->order_per_bit);
> +}
> +
> +unsigned long cma_get_maxchunk(struct cma *cma)
> +{
> +	unsigned long maxchunk = 0;
> +	unsigned long start, end = 0;
> +
> +	mutex_lock(&cma->lock);
> +	for (;;) {
> +		start = find_next_zero_bit(cma->bitmap, cma->count, end);
> +		if (start >= cma->count)
> +			break;
> +		end = find_next_bit(cma->bitmap, cma->count, start);
> +		maxchunk = max(end - start, maxchunk);
> +	}
> +	mutex_unlock(&cma->lock);
> +
> +	return maxchunk << (PAGE_SHIFT + cma->order_per_bit);
> +}
> +
>  static unsigned long cma_bitmap_aligned_mask(struct cma *cma, int align_order)
>  {
>  	if (align_order <= cma->order_per_bit)
> @@ -591,6 +621,10 @@ static int s_show(struct seq_file *m, void *p)
>  	struct cma_buffer *cmabuf;
>  	struct stack_trace trace;
>  
> +	seq_printf(m, "CMARegion stat: %8lu kB total, %8lu kB used, %8lu kB max contiguous chunk\n\n",
> +		   cma_get_size(cma) >> 10,
> +		   cma_get_used(cma) >> 10,
> +		   cma_get_maxchunk(cma) >> 10);
>  	mutex_lock(&cma->list_lock);
>  
>  	list_for_each_entry(cmabuf, &cma->buffers_list, list) {
> -- 
> 2.1.0
>

-- 
Best regards,                                         _     _
.o. | Liege of Serenely Enlightened Majesty of      o' \,=./ `o
..o | Computer Science,  Michał “mina86” Nazarewicz    (o o)
ooo +--<mpn@google.com>--<xmpp:mina86@jabber.org>--ooO--(_)--Ooo--

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 3/3] cma: add functions to get region pages counters
  2014-12-26 14:39 ` [PATCH 3/3] cma: add functions to get region pages counters Stefan I. Strogin
  2014-12-26 16:10   ` Michal Nazarewicz
@ 2014-12-27  7:18   ` SeongJae Park
  2014-12-29  5:56     ` Safonov Dmitry
  2014-12-30  2:26   ` Joonsoo Kim
  2 siblings, 1 reply; 39+ messages in thread
From: SeongJae Park @ 2014-12-27  7:18 UTC (permalink / raw)
  To: Stefan I. Strogin
  Cc: linux-mm, linux-kernel, Dmitry Safonov, Joonsoo Kim,
	Andrew Morton, Marek Szyprowski, Michal Nazarewicz, aneesh.kumar,
	Laurent Pinchart, Pintu Kumar, Weijie Yang, Laura Abbott,
	SeongJae Park, Hui Zhu, Minchan Kim, Dyasly Sergey,
	Vyacheslav Tyrtov

Hello,

On Fri, 26 Dec 2014, Stefan I. Strogin wrote:

> From: Dmitry Safonov <d.safonov@partner.samsung.com>
>
> Here are two functions that provide interface to compute/get used size
> and size of biggest free chunk in cma region.
> Added that information in cmainfo.
>
> Signed-off-by: Dmitry Safonov <d.safonov@partner.samsung.com>
> ---
> include/linux/cma.h |  2 ++
> mm/cma.c            | 34 ++++++++++++++++++++++++++++++++++
> 2 files changed, 36 insertions(+)
>
> diff --git a/include/linux/cma.h b/include/linux/cma.h
> index 9384ba6..855e6f2 100644
> --- a/include/linux/cma.h
> +++ b/include/linux/cma.h
> @@ -18,6 +18,8 @@ struct cma;
> extern unsigned long totalcma_pages;
> extern phys_addr_t cma_get_base(struct cma *cma);
> extern unsigned long cma_get_size(struct cma *cma);
> +extern unsigned long cma_get_used(struct cma *cma);
> +extern unsigned long cma_get_maxchunk(struct cma *cma);
>
> extern int __init cma_declare_contiguous(phys_addr_t base,
> 			phys_addr_t size, phys_addr_t limit,
> diff --git a/mm/cma.c b/mm/cma.c
> index ffaea26..5e560ed 100644
> --- a/mm/cma.c
> +++ b/mm/cma.c
> @@ -78,6 +78,36 @@ unsigned long cma_get_size(struct cma *cma)
> 	return cma->count << PAGE_SHIFT;
> }
>
> +unsigned long cma_get_used(struct cma *cma)
> +{
> +	unsigned long ret = 0;
> +
> +	mutex_lock(&cma->lock);
> +	/* pages counter is smaller than sizeof(int) */
> +	ret = bitmap_weight(cma->bitmap, (int)cma->count);
> +	mutex_unlock(&cma->lock);
> +
> +	return ret << (PAGE_SHIFT + cma->order_per_bit);
> +}
> +
> +unsigned long cma_get_maxchunk(struct cma *cma)
> +{
> +	unsigned long maxchunk = 0;
> +	unsigned long start, end = 0;
> +
> +	mutex_lock(&cma->lock);
> +	for (;;) {
> +		start = find_next_zero_bit(cma->bitmap, cma->count, end);
> +		if (start >= cma->count)
> +			break;
> +		end = find_next_bit(cma->bitmap, cma->count, start);
> +		maxchunk = max(end - start, maxchunk);
> +	}
> +	mutex_unlock(&cma->lock);
> +
> +	return maxchunk << (PAGE_SHIFT + cma->order_per_bit);
> +}
> +
> static unsigned long cma_bitmap_aligned_mask(struct cma *cma, int align_order)
> {
> 	if (align_order <= cma->order_per_bit)
> @@ -591,6 +621,10 @@ static int s_show(struct seq_file *m, void *p)
> 	struct cma_buffer *cmabuf;
> 	struct stack_trace trace;
>
> +	seq_printf(m, "CMARegion stat: %8lu kB total, %8lu kB used, %8lu kB max contiguous chunk\n\n",

How about 'CMA Region' rather than 'CMARegion'?

> +		   cma_get_size(cma) >> 10,
> +		   cma_get_used(cma) >> 10,
> +		   cma_get_maxchunk(cma) >> 10);
> 	mutex_lock(&cma->list_lock);
>
> 	list_for_each_entry(cmabuf, &cma->buffers_list, list) {
> -- 
> 2.1.0
>
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 3/3] cma: add functions to get region pages counters
  2014-12-27  7:18   ` SeongJae Park
@ 2014-12-29  5:56     ` Safonov Dmitry
  2014-12-29 14:12       ` Stefan Strogin
  0 siblings, 1 reply; 39+ messages in thread
From: Safonov Dmitry @ 2014-12-29  5:56 UTC (permalink / raw)
  To: SeongJae Park, Stefan I. Strogin
  Cc: linux-mm, linux-kernel, Joonsoo Kim, Andrew Morton,
	Marek Szyprowski, Michal Nazarewicz, aneesh.kumar,
	Laurent Pinchart, Pintu Kumar, Weijie Yang, Laura Abbott, Hui Zhu,
	Minchan Kim, Dyasly Sergey, Vyacheslav Tyrtov


On 12/27/2014 10:18 AM, SeongJae Park wrote:
> Hello,
>
> How about 'CMA Region' rather than 'CMARegion'?
Sure.

-- 
Best regards,
Safonov Dmitry.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 3/3] cma: add functions to get region pages counters
  2014-12-29  5:56     ` Safonov Dmitry
@ 2014-12-29 14:12       ` Stefan Strogin
  0 siblings, 0 replies; 39+ messages in thread
From: Stefan Strogin @ 2014-12-29 14:12 UTC (permalink / raw)
  To: Safonov Dmitry, SeongJae Park, Stefan I. Strogin
  Cc: linux-mm, linux-kernel, Joonsoo Kim, Andrew Morton,
	Marek Szyprowski, Michal Nazarewicz, aneesh.kumar,
	Laurent Pinchart, Pintu Kumar, Weijie Yang, Laura Abbott, Hui Zhu,
	Minchan Kim, Dyasly Sergey, Vyacheslav Tyrtov

29.12.2014 06:56, Safonov Dmitry D?D,N?DuN?:
>
> On 12/27/2014 10:18 AM, SeongJae Park wrote:
>> Hello,
>>
>> How about 'CMA Region' rather than 'CMARegion'?
> Sure.
>

I would like "CMA area..." :)
Or rather "CMA area #%u: base 0x%llx...",
		   cma - &cma_areas[0],
		   (unsigned long long)cma_get_base(cma),

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 3/3] cma: add functions to get region pages counters
  2014-12-26 14:39 ` [PATCH 3/3] cma: add functions to get region pages counters Stefan I. Strogin
  2014-12-26 16:10   ` Michal Nazarewicz
  2014-12-27  7:18   ` SeongJae Park
@ 2014-12-30  2:26   ` Joonsoo Kim
  2014-12-30 14:41     ` Michal Nazarewicz
  2 siblings, 1 reply; 39+ messages in thread
From: Joonsoo Kim @ 2014-12-30  2:26 UTC (permalink / raw)
  To: Stefan I. Strogin
  Cc: linux-mm, linux-kernel, Dmitry Safonov, Andrew Morton,
	Marek Szyprowski, Michal Nazarewicz, aneesh.kumar,
	Laurent Pinchart, Pintu Kumar, Weijie Yang, Laura Abbott,
	SeongJae Park, Hui Zhu, Minchan Kim, Dyasly Sergey,
	Vyacheslav Tyrtov

On Fri, Dec 26, 2014 at 05:39:04PM +0300, Stefan I. Strogin wrote:
> From: Dmitry Safonov <d.safonov@partner.samsung.com>
> 
> Here are two functions that provide interface to compute/get used size
> and size of biggest free chunk in cma region.
> Added that information in cmainfo.
> 
> Signed-off-by: Dmitry Safonov <d.safonov@partner.samsung.com>
> ---
>  include/linux/cma.h |  2 ++
>  mm/cma.c            | 34 ++++++++++++++++++++++++++++++++++
>  2 files changed, 36 insertions(+)
> 
> diff --git a/include/linux/cma.h b/include/linux/cma.h
> index 9384ba6..855e6f2 100644
> --- a/include/linux/cma.h
> +++ b/include/linux/cma.h
> @@ -18,6 +18,8 @@ struct cma;
>  extern unsigned long totalcma_pages;
>  extern phys_addr_t cma_get_base(struct cma *cma);
>  extern unsigned long cma_get_size(struct cma *cma);
> +extern unsigned long cma_get_used(struct cma *cma);
> +extern unsigned long cma_get_maxchunk(struct cma *cma);
>  
>  extern int __init cma_declare_contiguous(phys_addr_t base,
>  			phys_addr_t size, phys_addr_t limit,
> diff --git a/mm/cma.c b/mm/cma.c
> index ffaea26..5e560ed 100644
> --- a/mm/cma.c
> +++ b/mm/cma.c
> @@ -78,6 +78,36 @@ unsigned long cma_get_size(struct cma *cma)
>  	return cma->count << PAGE_SHIFT;
>  }
>  
> +unsigned long cma_get_used(struct cma *cma)
> +{
> +	unsigned long ret = 0;
> +
> +	mutex_lock(&cma->lock);
> +	/* pages counter is smaller than sizeof(int) */
> +	ret = bitmap_weight(cma->bitmap, (int)cma->count);
> +	mutex_unlock(&cma->lock);
> +
> +	return ret << (PAGE_SHIFT + cma->order_per_bit);
> +}
> +
> +unsigned long cma_get_maxchunk(struct cma *cma)
> +{
> +	unsigned long maxchunk = 0;
> +	unsigned long start, end = 0;
> +
> +	mutex_lock(&cma->lock);
> +	for (;;) {
> +		start = find_next_zero_bit(cma->bitmap, cma->count, end);
> +		if (start >= cma->count)
> +			break;
> +		end = find_next_bit(cma->bitmap, cma->count, start);
> +		maxchunk = max(end - start, maxchunk);
> +	}
> +	mutex_unlock(&cma->lock);
> +
> +	return maxchunk << (PAGE_SHIFT + cma->order_per_bit);
> +}
> +
>  static unsigned long cma_bitmap_aligned_mask(struct cma *cma, int align_order)
>  {
>  	if (align_order <= cma->order_per_bit)
> @@ -591,6 +621,10 @@ static int s_show(struct seq_file *m, void *p)
>  	struct cma_buffer *cmabuf;
>  	struct stack_trace trace;
>  
> +	seq_printf(m, "CMARegion stat: %8lu kB total, %8lu kB used, %8lu kB max contiguous chunk\n\n",
> +		   cma_get_size(cma) >> 10,
> +		   cma_get_used(cma) >> 10,
> +		   cma_get_maxchunk(cma) >> 10);
>  	mutex_lock(&cma->list_lock);
>  
>  	list_for_each_entry(cmabuf, &cma->buffers_list, list) {

Hello,

How about changing printing format like as meminfo or zoneinfo?

CMARegion #
Total: XXX
Used: YYY
MaxContig: ZZZ

It would help to parse information.

And, how about adding how many pages are used now as system pages?
You can implement it by iterating range of CMA region and checking
Buddy flag.

UsedBySystem = Total - UsedByCMA - freepageinCMARegion

Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 3/3] cma: add functions to get region pages counters
  2014-12-30  2:26   ` Joonsoo Kim
@ 2014-12-30 14:41     ` Michal Nazarewicz
  2014-12-30 14:46       ` Safonov Dmitry
  0 siblings, 1 reply; 39+ messages in thread
From: Michal Nazarewicz @ 2014-12-30 14:41 UTC (permalink / raw)
  To: Joonsoo Kim, Stefan I. Strogin
  Cc: linux-mm, linux-kernel, Dmitry Safonov, Andrew Morton,
	Marek Szyprowski, aneesh.kumar, Laurent Pinchart, Pintu Kumar,
	Weijie Yang, Laura Abbott, SeongJae Park, Hui Zhu, Minchan Kim,
	Dyasly Sergey, Vyacheslav Tyrtov

> On Fri, Dec 26, 2014 at 05:39:04PM +0300, Stefan I. Strogin wrote:
>> From: Dmitry Safonov <d.safonov@partner.samsung.com>
>> @@ -591,6 +621,10 @@ static int s_show(struct seq_file *m, void *p)
>>  	struct cma_buffer *cmabuf;
>>  	struct stack_trace trace;
>>  
>> +	seq_printf(m, "CMARegion stat: %8lu kB total, %8lu kB used, %8lu kB max contiguous chunk\n\n",
>> +		   cma_get_size(cma) >> 10,
>> +		   cma_get_used(cma) >> 10,
>> +		   cma_get_maxchunk(cma) >> 10);
>>  	mutex_lock(&cma->list_lock);
>>  
>>  	list_for_each_entry(cmabuf, &cma->buffers_list, list) {

On Tue, Dec 30 2014, Joonsoo Kim <iamjoonsoo.kim@lge.com> wrote:
> How about changing printing format like as meminfo or zoneinfo?
>
> CMARegion #
> Total: XXX
> Used: YYY
> MaxContig: ZZZ

+1.  I was also thinking about this actually.

-- 
Best regards,                                         _     _
.o. | Liege of Serenely Enlightened Majesty of      o' \,=./ `o
..o | Computer Science,  Michał “mina86” Nazarewicz    (o o)
ooo +--<mpn@google.com>--<xmpp:mina86@jabber.org>--ooO--(_)--Ooo--

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 3/3] cma: add functions to get region pages counters
  2014-12-30 14:41     ` Michal Nazarewicz
@ 2014-12-30 14:46       ` Safonov Dmitry
  0 siblings, 0 replies; 39+ messages in thread
From: Safonov Dmitry @ 2014-12-30 14:46 UTC (permalink / raw)
  To: Michal Nazarewicz, Joonsoo Kim, Stefan I. Strogin
  Cc: linux-mm, linux-kernel, Andrew Morton, Marek Szyprowski,
	aneesh.kumar, Laurent Pinchart, Pintu Kumar, Weijie Yang,
	Laura Abbott, SeongJae Park, Hui Zhu, Minchan Kim, Dyasly Sergey,
	Vyacheslav Tyrtov


On 12/30/2014 05:41 PM, Michal Nazarewicz wrote:
>> On Fri, Dec 26, 2014 at 05:39:04PM +0300, Stefan I. Strogin wrote:
>>> From: Dmitry Safonov <d.safonov@partner.samsung.com>
>>> @@ -591,6 +621,10 @@ static int s_show(struct seq_file *m, void *p)
>>>   	struct cma_buffer *cmabuf;
>>>   	struct stack_trace trace;
>>>   
>>> +	seq_printf(m, "CMARegion stat: %8lu kB total, %8lu kB used, %8lu kB max contiguous chunk\n\n",
>>> +		   cma_get_size(cma) >> 10,
>>> +		   cma_get_used(cma) >> 10,
>>> +		   cma_get_maxchunk(cma) >> 10);
>>>   	mutex_lock(&cma->list_lock);
>>>   
>>>   	list_for_each_entry(cmabuf, &cma->buffers_list, list) {
> On Tue, Dec 30 2014, Joonsoo Kim <iamjoonsoo.kim@lge.com> wrote:
>> How about changing printing format like as meminfo or zoneinfo?
>>
>> CMARegion #
>> Total: XXX
>> Used: YYY
>> MaxContig: ZZZ
> +1.  I was also thinking about this actually.
>
Yeah, I thought about it. Sure.

-- 
Best regards,
Safonov Dmitry.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 0/3] mm: cma: /proc/cmainfo
  2014-12-26 14:39 [PATCH 0/3] mm: cma: /proc/cmainfo Stefan I. Strogin
                   ` (2 preceding siblings ...)
  2014-12-26 14:39 ` [PATCH 3/3] cma: add functions to get region pages counters Stefan I. Strogin
@ 2014-12-29  2:36 ` Minchan Kim
  2014-12-29 19:52   ` Laura Abbott
  2015-01-02  5:11   ` Pavel Machek
  3 siblings, 2 replies; 39+ messages in thread
From: Minchan Kim @ 2014-12-29  2:36 UTC (permalink / raw)
  To: Stefan I. Strogin
  Cc: linux-mm, linux-kernel, Joonsoo Kim, Andrew Morton,
	Marek Szyprowski, Michal Nazarewicz, aneesh.kumar,
	Laurent Pinchart, Dmitry Safonov, Pintu Kumar, Weijie Yang,
	Laura Abbott, SeongJae Park, Hui Zhu, Dyasly Sergey,
	Vyacheslav Tyrtov

Hello,

On Fri, Dec 26, 2014 at 05:39:01PM +0300, Stefan I. Strogin wrote:
> Hello all,
> 
> Here is a patch set that adds /proc/cmainfo.
> 
> When compiled with CONFIG_CMA_DEBUG /proc/cmainfo will contain information
> about about total, used, maximum free contiguous chunk and all currently
> allocated contiguous buffers in CMA regions. The information about allocated
> CMA buffers includes pid, comm, allocation latency and stacktrace at the
> moment of allocation.

It just says what you are doing but you didn't say why we need it.
I can guess but clear description(ie, the problem what you want to
solve with this patchset) would help others to review, for instance,
why we need latency, why we need callstack, why we need new wheel
rather than ftrace and so on.

Thanks.

> 
> Example:
> 
> # cat /proc/cmainfo 
> CMARegion stat:    65536 kB total,      248 kB used,    65216 kB max contiguous chunk
> 
> 0x32400000 - 0x32401000 (4 kB), allocated by pid 63 (systemd-udevd), latency 74 us
>  [<c1006e96>] dma_generic_alloc_coherent+0x86/0x160
>  [<c13093af>] rpm_idle+0x1f/0x1f0
>  [<c1006e10>] dma_generic_alloc_coherent+0x0/0x160
>  [<f80a533e>] ohci_init+0x1fe/0x430 [ohci_hcd]
>  [<c1006e10>] dma_generic_alloc_coherent+0x0/0x160
>  [<f801404f>] ohci_pci_reset+0x4f/0x60 [ohci_pci]
>  [<f80f165c>] usb_add_hcd+0x1fc/0x900 [usbcore]
>  [<c1256158>] pcibios_set_master+0x38/0x90
>  [<f8101ea6>] usb_hcd_pci_probe+0x176/0x4f0 [usbcore]
>  [<c125852f>] pci_device_probe+0x6f/0xd0
>  [<c1199495>] sysfs_create_link+0x25/0x50
>  [<c1300522>] driver_probe_device+0x92/0x3b0
>  [<c14564fb>] __mutex_lock_slowpath+0x5b/0x90
>  [<c1300880>] __driver_attach+0x0/0x80
>  [<c13008f9>] __driver_attach+0x79/0x80
>  [<c1300880>] __driver_attach+0x0/0x80
> 
> 0x32401000 - 0x32402000 (4 kB), allocated by pid 58 (systemd-udevd), latency 17 us
>  [<c130e370>] dmam_coherent_release+0x0/0x90
>  [<c112d76c>] __kmalloc_track_caller+0x31c/0x380
>  [<c1006e96>] dma_generic_alloc_coherent+0x86/0x160
>  [<c1006e10>] dma_generic_alloc_coherent+0x0/0x160
>  [<c130e226>] dmam_alloc_coherent+0xb6/0x100
>  [<f8125153>] ata_bmdma_port_start+0x43/0x60 [libata]
>  [<f8113068>] ata_host_start.part.29+0xb8/0x190 [libata]
>  [<c13624a0>] pci_read+0x30/0x40
>  [<f8124eb9>] ata_pci_sff_activate_host+0x29/0x220 [libata]
>  [<f8127050>] ata_bmdma_interrupt+0x0/0x1f0 [libata]
>  [<c1256158>] pcibios_set_master+0x38/0x90
>  [<f80ad9be>] piix_init_one+0x44e/0x630 [ata_piix]
>  [<c1455ef0>] mutex_lock+0x10/0x20
>  [<c1197093>] kernfs_activate+0x63/0xd0
>  [<c11971c3>] kernfs_add_one+0xc3/0x130
>  [<c125852f>] pci_device_probe+0x6f/0xd0
> <...>
> 
> Dmitry Safonov (1):
>   cma: add functions to get region pages counters
> 
> Stefan I. Strogin (2):
>   stacktrace: add seq_print_stack_trace()
>   mm: cma: introduce /proc/cmainfo
> 
>  include/linux/cma.h        |   2 +
>  include/linux/stacktrace.h |   4 +
>  kernel/stacktrace.c        |  17 ++++
>  mm/cma.c                   | 236 +++++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 259 insertions(+)
> 
> -- 
> 2.1.0
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 0/3] mm: cma: /proc/cmainfo
  2014-12-29  2:36 ` [PATCH 0/3] mm: cma: /proc/cmainfo Minchan Kim
@ 2014-12-29 19:52   ` Laura Abbott
  2014-12-30  4:47     ` Minchan Kim
  2015-01-02  5:11   ` Pavel Machek
  1 sibling, 1 reply; 39+ messages in thread
From: Laura Abbott @ 2014-12-29 19:52 UTC (permalink / raw)
  To: Minchan Kim, Stefan I. Strogin
  Cc: linux-mm, linux-kernel, Joonsoo Kim, Andrew Morton,
	Marek Szyprowski, Michal Nazarewicz, aneesh.kumar,
	Laurent Pinchart, Dmitry Safonov, Pintu Kumar, Weijie Yang,
	SeongJae Park, Hui Zhu, Dyasly Sergey, Vyacheslav Tyrtov

On 12/28/2014 6:36 PM, Minchan Kim wrote:
> Hello,
>
> On Fri, Dec 26, 2014 at 05:39:01PM +0300, Stefan I. Strogin wrote:
>> Hello all,
>>
>> Here is a patch set that adds /proc/cmainfo.
>>
>> When compiled with CONFIG_CMA_DEBUG /proc/cmainfo will contain information
>> about about total, used, maximum free contiguous chunk and all currently
>> allocated contiguous buffers in CMA regions. The information about allocated
>> CMA buffers includes pid, comm, allocation latency and stacktrace at the
>> moment of allocation.
>
> It just says what you are doing but you didn't say why we need it.
> I can guess but clear description(ie, the problem what you want to
> solve with this patchset) would help others to review, for instance,
> why we need latency, why we need callstack, why we need new wheel
> rather than ftrace and so on.
>
> Thanks.
>

I've been meaning to write something like this for a while so I'm
happy to see an attempt made to fix this. I can't speak for the
author's reasons for wanting this information but there are
several reasons why I was thinking of something similar.

The most common bug reports seen internally on CMA are 1) CMA is
too slow and 2) CMA failed to allocate memory. For #1, not all
allocations may be slow so it's useful to be able to keep track
of which allocations are taking too long. For #2, migration
failure is fairly common but it's still important to rule out
a memory leak from a dma client. Seeing all the allocations is
also very useful for memory tuning (e.g. how big does the CMA
region need to be, which clients are actually allocating memory).

ftrace is certainly usable for tracing CMA allocation callers and
latency. ftrace is still only a fixed size buffer though so it's
possible for information to be lost if other logging is enabled.
For most of the CMA use cases, there is a very high cost if the
proper debugging information is not available so the more that
can be guaranteed the better.

It's also worth noting that the SLUB allocator has a sysfs
interface for showing allocation callers when CONFIG_SLUB_DEBUG
is enabled.

Thanks,
Laura

-- 
Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 0/3] mm: cma: /proc/cmainfo
  2014-12-29 19:52   ` Laura Abbott
@ 2014-12-30  4:47     ` Minchan Kim
  2014-12-30 22:00       ` Laura Abbott
                         ` (2 more replies)
  0 siblings, 3 replies; 39+ messages in thread
From: Minchan Kim @ 2014-12-30  4:47 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Stefan I. Strogin, linux-mm, linux-kernel, Joonsoo Kim,
	Andrew Morton, Marek Szyprowski, Michal Nazarewicz, aneesh.kumar,
	Laurent Pinchart, Dmitry Safonov, Pintu Kumar, Weijie Yang,
	SeongJae Park, Hui Zhu, Dyasly Sergey, Vyacheslav Tyrtov, rostedt,
	namhyung

On Mon, Dec 29, 2014 at 11:52:58AM -0800, Laura Abbott wrote:
> On 12/28/2014 6:36 PM, Minchan Kim wrote:
> >Hello,
> >
> >On Fri, Dec 26, 2014 at 05:39:01PM +0300, Stefan I. Strogin wrote:
> >>Hello all,
> >>
> >>Here is a patch set that adds /proc/cmainfo.
> >>
> >>When compiled with CONFIG_CMA_DEBUG /proc/cmainfo will contain information
> >>about about total, used, maximum free contiguous chunk and all currently
> >>allocated contiguous buffers in CMA regions. The information about allocated
> >>CMA buffers includes pid, comm, allocation latency and stacktrace at the
> >>moment of allocation.
> >
> >It just says what you are doing but you didn't say why we need it.
> >I can guess but clear description(ie, the problem what you want to
> >solve with this patchset) would help others to review, for instance,
> >why we need latency, why we need callstack, why we need new wheel
> >rather than ftrace and so on.
> >
> >Thanks.
> >
> 
> 
> I've been meaning to write something like this for a while so I'm
> happy to see an attempt made to fix this. I can't speak for the
> author's reasons for wanting this information but there are
> several reasons why I was thinking of something similar.
> 
> The most common bug reports seen internally on CMA are 1) CMA is
> too slow and 2) CMA failed to allocate memory. For #1, not all
> allocations may be slow so it's useful to be able to keep track
> of which allocations are taking too long. For #2, migration

Then, I don't think we could keep all of allocations. What we need
is only slow allocations. I hope we can do that with ftrace.

ex)

# cd /sys/kernel/debug/tracing
# echo 1 > options/stacktrace
# echo cam_alloc > set_ftrace_filter
# echo your_threshold > tracing_thresh

I know it doesn't work now but I think it's more flexible
and general way to handle such issues(ie, latency of some functions).
So, I hope we could enhance ftrace rather than new wheel.
Ccing ftrace people.

Futhermore, if we really need to have such information, we need more data
(ex, how many of pages were migrated out, how many pages were dropped
without migrated, how many pages were written back, how many pages were
retried with the page lock and so on).
In this case, event trace would be better.


> failure is fairly common but it's still important to rule out
> a memory leak from a dma client. Seeing all the allocations is
> also very useful for memory tuning (e.g. how big does the CMA
> region need to be, which clients are actually allocating memory).

Memory leak is really general problem and could we handle it with
page_owner?

> 
> ftrace is certainly usable for tracing CMA allocation callers and
> latency. ftrace is still only a fixed size buffer though so it's
> possible for information to be lost if other logging is enabled.

Sorry, I don't get with only above reasons why we need this. :(

> For most of the CMA use cases, there is a very high cost if the
> proper debugging information is not available so the more that
> can be guaranteed the better.
> 
> It's also worth noting that the SLUB allocator has a sysfs
> interface for showing allocation callers when CONFIG_SLUB_DEBUG
> is enabled.
> 
> Thanks,
> Laura
> 
> -- 
> Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> a Linux Foundation Collaborative Project
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 0/3] mm: cma: /proc/cmainfo
  2014-12-30  4:47     ` Minchan Kim
@ 2014-12-30 22:00       ` Laura Abbott
  2014-12-31  0:25         ` Minchan Kim
  2014-12-31  0:58       ` Gioh Kim
  2015-01-09 14:19       ` Steven Rostedt
  2 siblings, 1 reply; 39+ messages in thread
From: Laura Abbott @ 2014-12-30 22:00 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Stefan I. Strogin, linux-mm, linux-kernel, Joonsoo Kim,
	Andrew Morton, Marek Szyprowski, Michal Nazarewicz, aneesh.kumar,
	Laurent Pinchart, Dmitry Safonov, Pintu Kumar, Weijie Yang,
	SeongJae Park, Hui Zhu, Dyasly Sergey, Vyacheslav Tyrtov, rostedt,
	namhyung

On 12/29/2014 8:47 PM, Minchan Kim wrote:
>>
>>
>> I've been meaning to write something like this for a while so I'm
>> happy to see an attempt made to fix this. I can't speak for the
>> author's reasons for wanting this information but there are
>> several reasons why I was thinking of something similar.
>>
>> The most common bug reports seen internally on CMA are 1) CMA is
>> too slow and 2) CMA failed to allocate memory. For #1, not all
>> allocations may be slow so it's useful to be able to keep track
>> of which allocations are taking too long. For #2, migration
>
> Then, I don't think we could keep all of allocations. What we need
> is only slow allocations. I hope we can do that with ftrace.
>
> ex)
>
> # cd /sys/kernel/debug/tracing
> # echo 1 > options/stacktrace
> # echo cam_alloc > set_ftrace_filter
> # echo your_threshold > tracing_thresh
>
> I know it doesn't work now but I think it's more flexible
> and general way to handle such issues(ie, latency of some functions).
> So, I hope we could enhance ftrace rather than new wheel.
> Ccing ftrace people.
>
> Futhermore, if we really need to have such information, we need more data
> (ex, how many of pages were migrated out, how many pages were dropped
> without migrated, how many pages were written back, how many pages were
> retried with the page lock and so on).
> In this case, event trace would be better.
>
>

I agree ftrace is significantly more flexible in many respects but
for the type of information we're actually trying to collect here
ftrace may not be the right tool. Often times it won't be obvious there
will be a problem when starting a test so all debugging information
needs to be enabled. If the debugging information needs to be on
almost all the time anyway it seems silly to allow it be configurable
via ftrace.

>> failure is fairly common but it's still important to rule out
>> a memory leak from a dma client. Seeing all the allocations is
>> also very useful for memory tuning (e.g. how big does the CMA
>> region need to be, which clients are actually allocating memory).
>
> Memory leak is really general problem and could we handle it with
> page_owner?
>

True, but it gets difficult to narrow down which are CMA pages allocated
via the contiguous code path. page owner also can't differentiate between
different CMA regions, this needs to be done separately. This may
be a sign page owner needs some extensions independent of any CMA
work.

>>
>> ftrace is certainly usable for tracing CMA allocation callers and
>> latency. ftrace is still only a fixed size buffer though so it's
>> possible for information to be lost if other logging is enabled.
>
> Sorry, I don't get with only above reasons why we need this. :(
>

I guess from my perspective the problem that is being solved here
is a fairly fixed static problem. We know the information we always
want to collect and have available so the ability to turn it off
and on via ftrace doesn't seem necessary. The ftrace maintainers
will probably disagree here but doing 'cat foo' on a file is
easier than finding the particular events, setting thresholds,
collecting the trace and possibly post processing. It seems like
this is conflating tracing which ftrace does very well with getting
a snapshot of the system at a fixed point in time which is what
debugfs files are designed for. We really just want a snapshot of
allocation history with some information about those allocations.
There should be more ftrace events in the CMA path but I think those
should be in supplement to the debugfs interface and not a replacement.

Thanks,
Laura

-- 
Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 0/3] mm: cma: /proc/cmainfo
  2014-12-30 22:00       ` Laura Abbott
@ 2014-12-31  0:25         ` Minchan Kim
  2015-01-21 13:52           ` Stefan Strogin
  0 siblings, 1 reply; 39+ messages in thread
From: Minchan Kim @ 2014-12-31  0:25 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Stefan I. Strogin, linux-mm, linux-kernel, Joonsoo Kim,
	Andrew Morton, Marek Szyprowski, Michal Nazarewicz, aneesh.kumar,
	Laurent Pinchart, Dmitry Safonov, Pintu Kumar, Weijie Yang,
	SeongJae Park, Hui Zhu, Dyasly Sergey, Vyacheslav Tyrtov, rostedt,
	namhyung

On Tue, Dec 30, 2014 at 02:00:27PM -0800, Laura Abbott wrote:
> On 12/29/2014 8:47 PM, Minchan Kim wrote:
> >>
> >>
> >>I've been meaning to write something like this for a while so I'm
> >>happy to see an attempt made to fix this. I can't speak for the
> >>author's reasons for wanting this information but there are
> >>several reasons why I was thinking of something similar.
> >>
> >>The most common bug reports seen internally on CMA are 1) CMA is
> >>too slow and 2) CMA failed to allocate memory. For #1, not all
> >>allocations may be slow so it's useful to be able to keep track
> >>of which allocations are taking too long. For #2, migration
> >
> >Then, I don't think we could keep all of allocations. What we need
> >is only slow allocations. I hope we can do that with ftrace.
> >
> >ex)
> >
> ># cd /sys/kernel/debug/tracing
> ># echo 1 > options/stacktrace
> ># echo cam_alloc > set_ftrace_filter
> ># echo your_threshold > tracing_thresh
> >
> >I know it doesn't work now but I think it's more flexible
> >and general way to handle such issues(ie, latency of some functions).
> >So, I hope we could enhance ftrace rather than new wheel.
> >Ccing ftrace people.
> >
> >Futhermore, if we really need to have such information, we need more data
> >(ex, how many of pages were migrated out, how many pages were dropped
> >without migrated, how many pages were written back, how many pages were
> >retried with the page lock and so on).
> >In this case, event trace would be better.
> >
> >
> 
> I agree ftrace is significantly more flexible in many respects but
> for the type of information we're actually trying to collect here
> ftrace may not be the right tool. Often times it won't be obvious there
> will be a problem when starting a test so all debugging information
> needs to be enabled. If the debugging information needs to be on
> almost all the time anyway it seems silly to allow it be configurable
> via ftrace.

There is a trade off. Instead, ftrace will collect the information
with small overhead in runtime, even alomost zero-overhead when
we turns off so we could investigate the problem in live machine
without rebooting/rebuiling.

If the problem you are trying to solve is latency, I think ftrace
with more data(ie, # of migrated page, # of stall by dirty or
locking and so on) would be better. As current interface, something
did cma_alloc which was really slow but it just cma_release right before
we looked at the /proc/cmainfo. In that case, we will miss the
information. It means someone should poll cmainfo continuously to
avoid the missing so we should make reading part of cmainfo fast
or make notification mechanism or keep only top 10 entries.

Let's think as different view. If we know via cmainfo some function
was slow, it's always true? Slowness of cma_alloc depends migration
latency as well as cma region fragmentation so the function which
was really slow would be fast in future if we are luck while
fast function in old could be slower in future if there are lots of
dirty pages or small small contiguous space in CMA region.
I mean some funcion itself is slow or fast is not a important parameter
to pinpoint cma's problem if we should take care of CMA's slowness or
failing.

Anyway, I'm not saying I don't want to add any debug facility
to CMA. My point is this patchset doesn't say why author need it
so it's hard to review the code. Depending on the problem author
is looking, we should review what kinds of data, what kinds of
interface, what kinds of implementation need.
So please say more specific rather than just having better.

> 
> >>failure is fairly common but it's still important to rule out
> >>a memory leak from a dma client. Seeing all the allocations is
> >>also very useful for memory tuning (e.g. how big does the CMA
> >>region need to be, which clients are actually allocating memory).
> >
> >Memory leak is really general problem and could we handle it with
> >page_owner?
> >
> 
> True, but it gets difficult to narrow down which are CMA pages allocated
> via the contiguous code path. page owner also can't differentiate between

I don't get it. The page_owner provides backtrace so why is it hard
to parse contiguous code path?

> different CMA regions, this needs to be done separately. This may

Page owner just report PFN and we know which pfn range is any CMA regions
so can't we do postprocessing?

> be a sign page owner needs some extensions independent of any CMA
> work.
> >>
> >>ftrace is certainly usable for tracing CMA allocation callers and
> >>latency. ftrace is still only a fixed size buffer though so it's
> >>possible for information to be lost if other logging is enabled.
> >
> >Sorry, I don't get with only above reasons why we need this. :(
> >
> 
> I guess from my perspective the problem that is being solved here
> is a fairly fixed static problem. We know the information we always
> want to collect and have available so the ability to turn it off
> and on via ftrace doesn't seem necessary. The ftrace maintainers
> will probably disagree here but doing 'cat foo' on a file is
> easier than finding the particular events, setting thresholds,
> collecting the trace and possibly post processing. It seems like
> this is conflating tracing which ftrace does very well with getting
> a snapshot of the system at a fixed point in time which is what
> debugfs files are designed for. We really just want a snapshot of
> allocation history with some information about those allocations.
> There should be more ftrace events in the CMA path but I think those
> should be in supplement to the debugfs interface and not a replacement.

Again say, please more specific what kinds of problem you want to solve.
If it includes several problems(you said latency, leak), please
divide the patchset to solve each problem. Without it, there is no worth
to dive into code.

> 
> Thanks,
> Laura
> 
> -- 
> Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> a Linux Foundation Collaborative Project
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 0/3] mm: cma: /proc/cmainfo
  2014-12-31  0:25         ` Minchan Kim
@ 2015-01-21 13:52           ` Stefan Strogin
  2015-01-23  6:33             ` Joonsoo Kim
  0 siblings, 1 reply; 39+ messages in thread
From: Stefan Strogin @ 2015-01-21 13:52 UTC (permalink / raw)
  To: Minchan Kim, Laura Abbott
  Cc: linux-mm, linux-kernel, Joonsoo Kim, Andrew Morton,
	Marek Szyprowski, Michal Nazarewicz, aneesh.kumar,
	Laurent Pinchart, Dmitry Safonov, Pintu Kumar, Weijie Yang,
	SeongJae Park, Hui Zhu, Dyasly Sergey, Vyacheslav Tyrtov, rostedt,
	namhyung, s.strogin, stefan.strogin

Sorry for such a long delay. Now I'll try to answer all the questions
and make a second version.

The original reason of why we need a new debugging tool for CMA is
written by Minchan (http://www.spinics.net/lists/linux-mm/msg81519.html):
> 3. CMA allocation latency -> Broken
> 4. CMA allocation success guarantee -> Broken.

We have no acceptable solution for these problems yet. We use CMA in our
devices. But currently for lack of allocation guarantee there are some
memory buffers that are always allocated at boot time even if they're
not used. However we'd like to allocate contiguous buffers in runtime as
much as it's possible.

First we want an interface like /proc/vmallocinfo to see that all needed
contiguous buffers are allocated correctly, used/free memory in CMA
regions (like in /proc/meminfo) and also CMA region's fragmentation.
Stacktrace is used to see who and whence allocated each buffer. Since
vmallocinfo and meminfo are located in /proc I thought that cmainfo
should be in /proc too. Maybe latency is really unnecessary here (see
hereinafter).


Second (not implemented yet) we want to debug 3) and 4) (especially in
case of allocating in runtime). One of the main reasons of failed and
slow allocations is pinning A<<movableA>> pages for a long time, so they
can't be moved (SeongJae Park described it:
http://lwn.net/Articles/619865/).
To debug such cases we want to know for each allocation (for failed ones
as well) its latency and some information about page migration, e.g. the
number of pages that couldn't be migrated, why and page_owner's
information for pages that failed to be migrated.

To my mind this might help us to identify subsystems that pin pages for
too long in order to make such subsystems allocate only unmovable pages
or fix the long-time page pinning.

The last thing should be done in debugfs of course. Maybe something like
this, I'm not sure:
# cd /sys/kernel/debug/cma/<N>/
# ls
allocated failed migration_stat released (...)
# cat failed
0x32400000 - 0x32406000 (24 kB), allocated by pid 63 (systemd-udevd),
time spent 9000 us
pages migrations required: 4
succeeded [by the last try in __alloc_contig_migrate_range()]: 2
failed/given up [on the last try in __alloc_contig_migrate_range()]: 2,
page_owner's information for pages that couldn't be migrated.

# cat migration_stat
Total pages migration requests: 1000
Pages migrated successfully: 900
Pages migration give-ups: 80
Pages migration failures: 20
Average tries per successful migration: 1.89
(some other useful information)


On 12/31/2014 03:25 AM, Minchan Kim wrote:
> On Tue, Dec 30, 2014 at 02:00:27PM -0800, Laura Abbott wrote:
>> On 12/29/2014 8:47 PM, Minchan Kim wrote:
>>>> I've been meaning to write something like this for a while so I'm
>>>> happy to see an attempt made to fix this. I can't speak for the
>>>> author's reasons for wanting this information but there are
>>>> several reasons why I was thinking of something similar.
>>>>
>>>> The most common bug reports seen internally on CMA are 1) CMA is
>>>> too slow and 2) CMA failed to allocate memory. For #1, not all
>>>> allocations may be slow so it's useful to be able to keep track
>>>> of which allocations are taking too long. For #2, migration
>>> Then, I don't think we could keep all of allocations. What we need
>>> is only slow allocations. I hope we can do that with ftrace.
>>>
>>> ex)
>>>
>>> # cd /sys/kernel/debug/tracing
>>> # echo 1 > options/stacktrace
>>> # echo cam_alloc > set_ftrace_filter
>>> # echo your_threshold > tracing_thresh
>>>
>>> I know it doesn't work now but I think it's more flexible
>>> and general way to handle such issues(ie, latency of some functions).
>>> So, I hope we could enhance ftrace rather than new wheel.
>>> Ccing ftrace people.
>>>
>>> Futhermore, if we really need to have such information, we need more data
>>> (ex, how many of pages were migrated out, how many pages were dropped
>>> without migrated, how many pages were written back, how many pages were
>>> retried with the page lock and so on).
>>> In this case, event trace would be better.
>>>
>>>
>> I agree ftrace is significantly more flexible in many respects but
>> for the type of information we're actually trying to collect here
>> ftrace may not be the right tool. Often times it won't be obvious there
>> will be a problem when starting a test so all debugging information
>> needs to be enabled. If the debugging information needs to be on
>> almost all the time anyway it seems silly to allow it be configurable
>> via ftrace.
> There is a trade off. Instead, ftrace will collect the information
> with small overhead in runtime, even alomost zero-overhead when
> we turns off so we could investigate the problem in live machine
> without rebooting/rebuiling.
>
> If the problem you are trying to solve is latency, I think ftrace
> with more data(ie, # of migrated page, # of stall by dirty or
> locking and so on) would be better. As current interface, something
> did cma_alloc which was really slow but it just cma_release right before
> we looked at the /proc/cmainfo. In that case, we will miss the
> information. It means someone should poll cmainfo continuously to
> avoid the missing so we should make reading part of cmainfo fast
> or make notification mechanism or keep only top 10 entries.
>
> Let's think as different view. If we know via cmainfo some function
> was slow, it's always true? Slowness of cma_alloc depends migration
> latency as well as cma region fragmentation so the function which
> was really slow would be fast in future if we are luck while
> fast function in old could be slower in future if there are lots of
> dirty pages or small small contiguous space in CMA region.
> I mean some funcion itself is slow or fast is not a important parameter
> to pinpoint cma's problem if we should take care of CMA's slowness or
> failing.
>
> Anyway, I'm not saying I don't want to add any debug facility
> to CMA. My point is this patchset doesn't say why author need it
> so it's hard to review the code. Depending on the problem author
> is looking, we should review what kinds of data, what kinds of
> interface, what kinds of implementation need.
> So please say more specific rather than just having better.
>
>>>> failure is fairly common but it's still important to rule out
>>>> a memory leak from a dma client. Seeing all the allocations is
>>>> also very useful for memory tuning (e.g. how big does the CMA
>>>> region need to be, which clients are actually allocating memory).
>>> Memory leak is really general problem and could we handle it with
>>> page_owner?
>>>
>> True, but it gets difficult to narrow down which are CMA pages allocated
>> via the contiguous code path. page owner also can't differentiate between
> I don't get it. The page_owner provides backtrace so why is it hard
> to parse contiguous code path?
>
>> different CMA regions, this needs to be done separately. This may
> Page owner just report PFN and we know which pfn range is any CMA regions
> so can't we do postprocessing?
>
>> be a sign page owner needs some extensions independent of any CMA
>> work.
>>>> ftrace is certainly usable for tracing CMA allocation callers and
>>>> latency. ftrace is still only a fixed size buffer though so it's
>>>> possible for information to be lost if other logging is enabled.
>>> Sorry, I don't get with only above reasons why we need this. :(
>>>
>> I guess from my perspective the problem that is being solved here
>> is a fairly fixed static problem. We know the information we always
>> want to collect and have available so the ability to turn it off
>> and on via ftrace doesn't seem necessary. The ftrace maintainers
>> will probably disagree here but doing 'cat foo' on a file is
>> easier than finding the particular events, setting thresholds,
>> collecting the trace and possibly post processing. It seems like
>> this is conflating tracing which ftrace does very well with getting
>> a snapshot of the system at a fixed point in time which is what
>> debugfs files are designed for. We really just want a snapshot of
>> allocation history with some information about those allocations.
>> There should be more ftrace events in the CMA path but I think those
>> should be in supplement to the debugfs interface and not a replacement.
> Again say, please more specific what kinds of problem you want to solve.
> If it includes several problems(you said latency, leak), please
> divide the patchset to solve each problem. Without it, there is no worth
> to dive into code.
>
>> Thanks,
>> Laura
>>
>> -- 
>> Qualcomm Innovation Center, Inc.
>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>> a Linux Foundation Collaborative Project
>>
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majordomo@kvack.org.  For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 0/3] mm: cma: /proc/cmainfo
  2015-01-21 13:52           ` Stefan Strogin
@ 2015-01-23  6:33             ` Joonsoo Kim
  0 siblings, 0 replies; 39+ messages in thread
From: Joonsoo Kim @ 2015-01-23  6:33 UTC (permalink / raw)
  To: Stefan Strogin
  Cc: Minchan Kim, Laura Abbott, linux-mm, linux-kernel, Andrew Morton,
	Marek Szyprowski, Michal Nazarewicz, aneesh.kumar,
	Laurent Pinchart, Dmitry Safonov, Pintu Kumar, Weijie Yang,
	SeongJae Park, Hui Zhu, Dyasly Sergey, Vyacheslav Tyrtov, rostedt,
	namhyung, stefan.strogin

On Wed, Jan 21, 2015 at 04:52:36PM +0300, Stefan Strogin wrote:
> Sorry for such a long delay. Now I'll try to answer all the questions
> and make a second version.
> 
> The original reason of why we need a new debugging tool for CMA is
> written by Minchan (http://www.spinics.net/lists/linux-mm/msg81519.html):
> > 3. CMA allocation latency -> Broken
> > 4. CMA allocation success guarantee -> Broken.
> 
> We have no acceptable solution for these problems yet. We use CMA in our
> devices. But currently for lack of allocation guarantee there are some
> memory buffers that are always allocated at boot time even if they're
> not used. However we'd like to allocate contiguous buffers in runtime as
> much as it's possible.
> 
> First we want an interface like /proc/vmallocinfo to see that all needed
> contiguous buffers are allocated correctly, used/free memory in CMA
> regions (like in /proc/meminfo) and also CMA region's fragmentation.

Hello,

I agree that we need some information to debug or improve CMA.

But, why these complicate data structure in your code are needed for
information like as vmallocinfo? Just printing bitmap of struct cma seems
sufficient to me to check alignment and fragmentation problem.

> Stacktrace is used to see who and whence allocated each buffer. Since
> vmallocinfo and meminfo are located in /proc I thought that cmainfo
> should be in /proc too. Maybe latency is really unnecessary here (see
> hereinafter).

I guess that adding some tracepoints on alloc/free functions could
accomplish your purpose. They can print vairous information you want
and can also print stacktrace.

Thanks.

> 
> Second (not implemented yet) we want to debug 3) and 4) (especially in
> case of allocating in runtime). One of the main reasons of failed and
> slow allocations is pinning <<movable>> pages for a long time, so they
> can't be moved (SeongJae Park described it:
> http://lwn.net/Articles/619865/).
> To debug such cases we want to know for each allocation (for failed ones
> as well) its latency and some information about page migration, e.g. the
> number of pages that couldn't be migrated, why and page_owner's
> information for pages that failed to be migrated.
> 
> To my mind this might help us to identify subsystems that pin pages for
> too long in order to make such subsystems allocate only unmovable pages
> or fix the long-time page pinning.
> 
> The last thing should be done in debugfs of course. Maybe something like
> this, I'm not sure:
> # cd /sys/kernel/debug/cma/<N>/
> # ls
> allocated failed migration_stat released (...)
> # cat failed
> 0x32400000 - 0x32406000 (24 kB), allocated by pid 63 (systemd-udevd),
> time spent 9000 us
> pages migrations required: 4
> succeeded [by the last try in __alloc_contig_migrate_range()]: 2
> failed/given up [on the last try in __alloc_contig_migrate_range()]: 2,
> page_owner's information for pages that couldn't be migrated.
> 
> # cat migration_stat
> Total pages migration requests: 1000
> Pages migrated successfully: 900
> Pages migration give-ups: 80
> Pages migration failures: 20
> Average tries per successful migration: 1.89
> (some other useful information)
> 
> 
> On 12/31/2014 03:25 AM, Minchan Kim wrote:
> > On Tue, Dec 30, 2014 at 02:00:27PM -0800, Laura Abbott wrote:
> >> On 12/29/2014 8:47 PM, Minchan Kim wrote:
> >>>> I've been meaning to write something like this for a while so I'm
> >>>> happy to see an attempt made to fix this. I can't speak for the
> >>>> author's reasons for wanting this information but there are
> >>>> several reasons why I was thinking of something similar.
> >>>>
> >>>> The most common bug reports seen internally on CMA are 1) CMA is
> >>>> too slow and 2) CMA failed to allocate memory. For #1, not all
> >>>> allocations may be slow so it's useful to be able to keep track
> >>>> of which allocations are taking too long. For #2, migration
> >>> Then, I don't think we could keep all of allocations. What we need
> >>> is only slow allocations. I hope we can do that with ftrace.
> >>>
> >>> ex)
> >>>
> >>> # cd /sys/kernel/debug/tracing
> >>> # echo 1 > options/stacktrace
> >>> # echo cam_alloc > set_ftrace_filter
> >>> # echo your_threshold > tracing_thresh
> >>>
> >>> I know it doesn't work now but I think it's more flexible
> >>> and general way to handle such issues(ie, latency of some functions).
> >>> So, I hope we could enhance ftrace rather than new wheel.
> >>> Ccing ftrace people.
> >>>
> >>> Futhermore, if we really need to have such information, we need more data
> >>> (ex, how many of pages were migrated out, how many pages were dropped
> >>> without migrated, how many pages were written back, how many pages were
> >>> retried with the page lock and so on).
> >>> In this case, event trace would be better.
> >>>
> >>>
> >> I agree ftrace is significantly more flexible in many respects but
> >> for the type of information we're actually trying to collect here
> >> ftrace may not be the right tool. Often times it won't be obvious there
> >> will be a problem when starting a test so all debugging information
> >> needs to be enabled. If the debugging information needs to be on
> >> almost all the time anyway it seems silly to allow it be configurable
> >> via ftrace.
> > There is a trade off. Instead, ftrace will collect the information
> > with small overhead in runtime, even alomost zero-overhead when
> > we turns off so we could investigate the problem in live machine
> > without rebooting/rebuiling.
> >
> > If the problem you are trying to solve is latency, I think ftrace
> > with more data(ie, # of migrated page, # of stall by dirty or
> > locking and so on) would be better. As current interface, something
> > did cma_alloc which was really slow but it just cma_release right before
> > we looked at the /proc/cmainfo. In that case, we will miss the
> > information. It means someone should poll cmainfo continuously to
> > avoid the missing so we should make reading part of cmainfo fast
> > or make notification mechanism or keep only top 10 entries.
> >
> > Let's think as different view. If we know via cmainfo some function
> > was slow, it's always true? Slowness of cma_alloc depends migration
> > latency as well as cma region fragmentation so the function which
> > was really slow would be fast in future if we are luck while
> > fast function in old could be slower in future if there are lots of
> > dirty pages or small small contiguous space in CMA region.
> > I mean some funcion itself is slow or fast is not a important parameter
> > to pinpoint cma's problem if we should take care of CMA's slowness or
> > failing.
> >
> > Anyway, I'm not saying I don't want to add any debug facility
> > to CMA. My point is this patchset doesn't say why author need it
> > so it's hard to review the code. Depending on the problem author
> > is looking, we should review what kinds of data, what kinds of
> > interface, what kinds of implementation need.
> > So please say more specific rather than just having better.
> >
> >>>> failure is fairly common but it's still important to rule out
> >>>> a memory leak from a dma client. Seeing all the allocations is
> >>>> also very useful for memory tuning (e.g. how big does the CMA
> >>>> region need to be, which clients are actually allocating memory).
> >>> Memory leak is really general problem and could we handle it with
> >>> page_owner?
> >>>
> >> True, but it gets difficult to narrow down which are CMA pages allocated
> >> via the contiguous code path. page owner also can't differentiate between
> > I don't get it. The page_owner provides backtrace so why is it hard
> > to parse contiguous code path?
> >
> >> different CMA regions, this needs to be done separately. This may
> > Page owner just report PFN and we know which pfn range is any CMA regions
> > so can't we do postprocessing?
> >
> >> be a sign page owner needs some extensions independent of any CMA
> >> work.
> >>>> ftrace is certainly usable for tracing CMA allocation callers and
> >>>> latency. ftrace is still only a fixed size buffer though so it's
> >>>> possible for information to be lost if other logging is enabled.
> >>> Sorry, I don't get with only above reasons why we need this. :(
> >>>
> >> I guess from my perspective the problem that is being solved here
> >> is a fairly fixed static problem. We know the information we always
> >> want to collect and have available so the ability to turn it off
> >> and on via ftrace doesn't seem necessary. The ftrace maintainers
> >> will probably disagree here but doing 'cat foo' on a file is
> >> easier than finding the particular events, setting thresholds,
> >> collecting the trace and possibly post processing. It seems like
> >> this is conflating tracing which ftrace does very well with getting
> >> a snapshot of the system at a fixed point in time which is what
> >> debugfs files are designed for. We really just want a snapshot of
> >> allocation history with some information about those allocations.
> >> There should be more ftrace events in the CMA path but I think those
> >> should be in supplement to the debugfs interface and not a replacement.
> > Again say, please more specific what kinds of problem you want to solve.
> > If it includes several problems(you said latency, leak), please
> > divide the patchset to solve each problem. Without it, there is no worth
> > to dive into code.
> >
> >> Thanks,
> >> Laura
> >>
> >> -- 
> >> Qualcomm Innovation Center, Inc.
> >> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> >> a Linux Foundation Collaborative Project
> >>
> >> --
> >> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> >> the body to majordomo@kvack.org.  For more info on Linux MM,
> >> see: http://www.linux-mm.org/ .
> >> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 0/3] mm: cma: /proc/cmainfo
  2014-12-30  4:47     ` Minchan Kim
  2014-12-30 22:00       ` Laura Abbott
@ 2014-12-31  0:58       ` Gioh Kim
  2014-12-31  2:18         ` Minchan Kim
  2014-12-31  6:47         ` Namhyung Kim
  2015-01-09 14:19       ` Steven Rostedt
  2 siblings, 2 replies; 39+ messages in thread
From: Gioh Kim @ 2014-12-31  0:58 UTC (permalink / raw)
  To: Minchan Kim, Laura Abbott
  Cc: Stefan I. Strogin, linux-mm, linux-kernel, Joonsoo Kim,
	Andrew Morton, Marek Szyprowski, Michal Nazarewicz, aneesh.kumar,
	Laurent Pinchart, Dmitry Safonov, Pintu Kumar, Weijie Yang,
	SeongJae Park, Hui Zhu, Dyasly Sergey, Vyacheslav Tyrtov, rostedt,
	namhyung



2014-12-30 i??i?? 1:47i?? Minchan Kim i?'(e??) i?' e,?:
> On Mon, Dec 29, 2014 at 11:52:58AM -0800, Laura Abbott wrote:
>> On 12/28/2014 6:36 PM, Minchan Kim wrote:
>>> Hello,
>>>
>>> On Fri, Dec 26, 2014 at 05:39:01PM +0300, Stefan I. Strogin wrote:
>>>> Hello all,
>>>>
>>>> Here is a patch set that adds /proc/cmainfo.
>>>>
>>>> When compiled with CONFIG_CMA_DEBUG /proc/cmainfo will contain information
>>>> about about total, used, maximum free contiguous chunk and all currently
>>>> allocated contiguous buffers in CMA regions. The information about allocated
>>>> CMA buffers includes pid, comm, allocation latency and stacktrace at the
>>>> moment of allocation.
>>>
>>> It just says what you are doing but you didn't say why we need it.
>>> I can guess but clear description(ie, the problem what you want to
>>> solve with this patchset) would help others to review, for instance,
>>> why we need latency, why we need callstack, why we need new wheel
>>> rather than ftrace and so on.
>>>
>>> Thanks.
>>>
>>
>>
>> I've been meaning to write something like this for a while so I'm
>> happy to see an attempt made to fix this. I can't speak for the
>> author's reasons for wanting this information but there are
>> several reasons why I was thinking of something similar.
>>
>> The most common bug reports seen internally on CMA are 1) CMA is
>> too slow and 2) CMA failed to allocate memory. For #1, not all
>> allocations may be slow so it's useful to be able to keep track
>> of which allocations are taking too long. For #2, migration
>
> Then, I don't think we could keep all of allocations. What we need
> is only slow allocations. I hope we can do that with ftrace.
>
> ex)
>
> # cd /sys/kernel/debug/tracing
> # echo 1 > options/stacktrace
> # echo cam_alloc > set_ftrace_filter
> # echo your_threshold > tracing_thresh
>
> I know it doesn't work now but I think it's more flexible
> and general way to handle such issues(ie, latency of some functions).
> So, I hope we could enhance ftrace rather than new wheel.
> Ccing ftrace people.

For CMA performance test or code flow check, ftrace is better.

ex)
echo cma_alloc > /sys/kernel/debug/tracing/set_graph_function
echo function_graph > /sys/kernel/debug/tracing/current_tracer
echo funcgraph-proc > /sys/kernel/debug/tracing/trace_options
echo nosleep-time > /sys/kernel/debug/tracing/trace_options
echo funcgraph-tail > /sys/kernel/debug/tracing/trace_options
echo 1 > /sys/kernel/debug/tracing/tracing_on

This can trace every cam_alloc and allocation time.
I think ftrace is better to debug latency.
If a buffer had allocated and had peak latency and freed,
we can check it.

But ftrace doesn't provide current status how many buffers we have and what address it is.
So I think debugging information is useful.



>
> Futhermore, if we really need to have such information, we need more data
> (ex, how many of pages were migrated out, how many pages were dropped
> without migrated, how many pages were written back, how many pages were
> retried with the page lock and so on).
> In this case, event trace would be better.
>
>
>> failure is fairly common but it's still important to rule out
>> a memory leak from a dma client. Seeing all the allocations is
>> also very useful for memory tuning (e.g. how big does the CMA
>> region need to be, which clients are actually allocating memory).
>
> Memory leak is really general problem and could we handle it with
> page_owner?
>
>>
>> ftrace is certainly usable for tracing CMA allocation callers and
>> latency. ftrace is still only a fixed size buffer though so it's
>> possible for information to be lost if other logging is enabled.
>
> Sorry, I don't get with only above reasons why we need this. :(
>
>> For most of the CMA use cases, there is a very high cost if the
>> proper debugging information is not available so the more that
>> can be guaranteed the better.
>>
>> It's also worth noting that the SLUB allocator has a sysfs
>> interface for showing allocation callers when CONFIG_SLUB_DEBUG
>> is enabled.
>>
>> Thanks,
>> Laura
>>
>> --
>> Qualcomm Innovation Center, Inc.
>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>> a Linux Foundation Collaborative Project
>>
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majordomo@kvack.org.  For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 0/3] mm: cma: /proc/cmainfo
  2014-12-31  0:58       ` Gioh Kim
@ 2014-12-31  2:18         ` Minchan Kim
  2014-12-31  2:45           ` Gioh Kim
  2014-12-31  6:47         ` Namhyung Kim
  1 sibling, 1 reply; 39+ messages in thread
From: Minchan Kim @ 2014-12-31  2:18 UTC (permalink / raw)
  To: Gioh Kim
  Cc: Laura Abbott, Stefan I. Strogin, linux-mm, linux-kernel,
	Joonsoo Kim, Andrew Morton, Marek Szyprowski, Michal Nazarewicz,
	aneesh.kumar, Laurent Pinchart, Dmitry Safonov, Pintu Kumar,
	Weijie Yang, SeongJae Park, Hui Zhu, Dyasly Sergey,
	Vyacheslav Tyrtov, rostedt, namhyung

Hey, Gioh

On Wed, Dec 31, 2014 at 09:58:04AM +0900, Gioh Kim wrote:
> 
> 
> 2014-12-30 i??i?? 1:47i?? Minchan Kim i?'(e??) i?' e,?:
> >On Mon, Dec 29, 2014 at 11:52:58AM -0800, Laura Abbott wrote:
> >>On 12/28/2014 6:36 PM, Minchan Kim wrote:
> >>>Hello,
> >>>
> >>>On Fri, Dec 26, 2014 at 05:39:01PM +0300, Stefan I. Strogin wrote:
> >>>>Hello all,
> >>>>
> >>>>Here is a patch set that adds /proc/cmainfo.
> >>>>
> >>>>When compiled with CONFIG_CMA_DEBUG /proc/cmainfo will contain information
> >>>>about about total, used, maximum free contiguous chunk and all currently
> >>>>allocated contiguous buffers in CMA regions. The information about allocated
> >>>>CMA buffers includes pid, comm, allocation latency and stacktrace at the
> >>>>moment of allocation.
> >>>
> >>>It just says what you are doing but you didn't say why we need it.
> >>>I can guess but clear description(ie, the problem what you want to
> >>>solve with this patchset) would help others to review, for instance,
> >>>why we need latency, why we need callstack, why we need new wheel
> >>>rather than ftrace and so on.
> >>>
> >>>Thanks.
> >>>
> >>
> >>
> >>I've been meaning to write something like this for a while so I'm
> >>happy to see an attempt made to fix this. I can't speak for the
> >>author's reasons for wanting this information but there are
> >>several reasons why I was thinking of something similar.
> >>
> >>The most common bug reports seen internally on CMA are 1) CMA is
> >>too slow and 2) CMA failed to allocate memory. For #1, not all
> >>allocations may be slow so it's useful to be able to keep track
> >>of which allocations are taking too long. For #2, migration
> >
> >Then, I don't think we could keep all of allocations. What we need
> >is only slow allocations. I hope we can do that with ftrace.
> >
> >ex)
> >
> ># cd /sys/kernel/debug/tracing
> ># echo 1 > options/stacktrace
> ># echo cam_alloc > set_ftrace_filter
> ># echo your_threshold > tracing_thresh
> >
> >I know it doesn't work now but I think it's more flexible
> >and general way to handle such issues(ie, latency of some functions).
> >So, I hope we could enhance ftrace rather than new wheel.
> >Ccing ftrace people.
> 
> For CMA performance test or code flow check, ftrace is better.
> 
> ex)
> echo cma_alloc > /sys/kernel/debug/tracing/set_graph_function
> echo function_graph > /sys/kernel/debug/tracing/current_tracer
> echo funcgraph-proc > /sys/kernel/debug/tracing/trace_options
> echo nosleep-time > /sys/kernel/debug/tracing/trace_options
> echo funcgraph-tail > /sys/kernel/debug/tracing/trace_options
> echo 1 > /sys/kernel/debug/tracing/tracing_on

I didn't know such detail. Thanks for the tip, Gioh.

> 
> This can trace every cam_alloc and allocation time.
> I think ftrace is better to debug latency.
> If a buffer had allocated and had peak latency and freed,
> we can check it.

Agree.

> 
> But ftrace doesn't provide current status how many buffers we have and what address it is.
> So I think debugging information is useful.

I didn't say debug information is useless.
If we need to know snapshot of cma at the moment,
describe why we need it and send a patch to implement the idea
rather than dumping lots of information is always better.

> 
> 
> 
> >
> >Futhermore, if we really need to have such information, we need more data
> >(ex, how many of pages were migrated out, how many pages were dropped
> >without migrated, how many pages were written back, how many pages were
> >retried with the page lock and so on).
> >In this case, event trace would be better.
> >
> >
> >>failure is fairly common but it's still important to rule out
> >>a memory leak from a dma client. Seeing all the allocations is
> >>also very useful for memory tuning (e.g. how big does the CMA
> >>region need to be, which clients are actually allocating memory).
> >
> >Memory leak is really general problem and could we handle it with
> >page_owner?
> >
> >>
> >>ftrace is certainly usable for tracing CMA allocation callers and
> >>latency. ftrace is still only a fixed size buffer though so it's
> >>possible for information to be lost if other logging is enabled.
> >
> >Sorry, I don't get with only above reasons why we need this. :(
> >
> >>For most of the CMA use cases, there is a very high cost if the
> >>proper debugging information is not available so the more that
> >>can be guaranteed the better.
> >>
> >>It's also worth noting that the SLUB allocator has a sysfs
> >>interface for showing allocation callers when CONFIG_SLUB_DEBUG
> >>is enabled.
> >>
> >>Thanks,
> >>Laura
> >>
> >>--
> >>Qualcomm Innovation Center, Inc.
> >>Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> >>a Linux Foundation Collaborative Project
> >>
> >>--
> >>To unsubscribe, send a message with 'unsubscribe linux-mm' in
> >>the body to majordomo@kvack.org.  For more info on Linux MM,
> >>see: http://www.linux-mm.org/ .
> >>Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> >
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 0/3] mm: cma: /proc/cmainfo
  2014-12-31  2:18         ` Minchan Kim
@ 2014-12-31  2:45           ` Gioh Kim
  0 siblings, 0 replies; 39+ messages in thread
From: Gioh Kim @ 2014-12-31  2:45 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Laura Abbott, Stefan I. Strogin, linux-mm, linux-kernel,
	Joonsoo Kim, Andrew Morton, Marek Szyprowski, Michal Nazarewicz,
	aneesh.kumar, Laurent Pinchart, Dmitry Safonov, Pintu Kumar,
	Weijie Yang, SeongJae Park, Hui Zhu, Dyasly Sergey,
	Vyacheslav Tyrtov, rostedt, namhyung



2014-12-31 i??i ? 11:18i?? Minchan Kim i?'(e??) i?' e,?:
> Hey, Gioh
>
> On Wed, Dec 31, 2014 at 09:58:04AM +0900, Gioh Kim wrote:
>>
>>
>> 2014-12-30 i??i?? 1:47i?? Minchan Kim i?'(e??) i?' e,?:
>>> On Mon, Dec 29, 2014 at 11:52:58AM -0800, Laura Abbott wrote:
>>>> On 12/28/2014 6:36 PM, Minchan Kim wrote:
>>>>> Hello,
>>>>>
>>>>> On Fri, Dec 26, 2014 at 05:39:01PM +0300, Stefan I. Strogin wrote:
>>>>>> Hello all,
>>>>>>
>>>>>> Here is a patch set that adds /proc/cmainfo.
>>>>>>
>>>>>> When compiled with CONFIG_CMA_DEBUG /proc/cmainfo will contain information
>>>>>> about about total, used, maximum free contiguous chunk and all currently
>>>>>> allocated contiguous buffers in CMA regions. The information about allocated
>>>>>> CMA buffers includes pid, comm, allocation latency and stacktrace at the
>>>>>> moment of allocation.
>>>>>
>>>>> It just says what you are doing but you didn't say why we need it.
>>>>> I can guess but clear description(ie, the problem what you want to
>>>>> solve with this patchset) would help others to review, for instance,
>>>>> why we need latency, why we need callstack, why we need new wheel
>>>>> rather than ftrace and so on.
>>>>>
>>>>> Thanks.
>>>>>
>>>>
>>>>
>>>> I've been meaning to write something like this for a while so I'm
>>>> happy to see an attempt made to fix this. I can't speak for the
>>>> author's reasons for wanting this information but there are
>>>> several reasons why I was thinking of something similar.
>>>>
>>>> The most common bug reports seen internally on CMA are 1) CMA is
>>>> too slow and 2) CMA failed to allocate memory. For #1, not all
>>>> allocations may be slow so it's useful to be able to keep track
>>>> of which allocations are taking too long. For #2, migration
>>>
>>> Then, I don't think we could keep all of allocations. What we need
>>> is only slow allocations. I hope we can do that with ftrace.
>>>
>>> ex)
>>>
>>> # cd /sys/kernel/debug/tracing
>>> # echo 1 > options/stacktrace
>>> # echo cam_alloc > set_ftrace_filter
>>> # echo your_threshold > tracing_thresh
>>>
>>> I know it doesn't work now but I think it's more flexible
>>> and general way to handle such issues(ie, latency of some functions).
>>> So, I hope we could enhance ftrace rather than new wheel.
>>> Ccing ftrace people.
>>
>> For CMA performance test or code flow check, ftrace is better.
>>
>> ex)
>> echo cma_alloc > /sys/kernel/debug/tracing/set_graph_function
>> echo function_graph > /sys/kernel/debug/tracing/current_tracer
>> echo funcgraph-proc > /sys/kernel/debug/tracing/trace_options
>> echo nosleep-time > /sys/kernel/debug/tracing/trace_options
>> echo funcgraph-tail > /sys/kernel/debug/tracing/trace_options
>> echo 1 > /sys/kernel/debug/tracing/tracing_on
>
> I didn't know such detail. Thanks for the tip, Gioh.
>
>>
>> This can trace every cam_alloc and allocation time.
>> I think ftrace is better to debug latency.
>> If a buffer had allocated and had peak latency and freed,
>> we can check it.
>
> Agree.
>
>>
>> But ftrace doesn't provide current status how many buffers we have and what address it is.
>> So I think debugging information is useful.
>
> I didn't say debug information is useless.
> If we need to know snapshot of cma at the moment,
> describe why we need it and send a patch to implement the idea
> rather than dumping lots of information is always better.

Yes, you're right.
I mean this patch is useful to me.
I sometimes need to check each drivers has buffers that are correctly located and aligned.


>
>>
>>
>>
>>>
>>> Futhermore, if we really need to have such information, we need more data
>>> (ex, how many of pages were migrated out, how many pages were dropped
>>> without migrated, how many pages were written back, how many pages were
>>> retried with the page lock and so on).
>>> In this case, event trace would be better.
>>>
>>>
>>>> failure is fairly common but it's still important to rule out
>>>> a memory leak from a dma client. Seeing all the allocations is
>>>> also very useful for memory tuning (e.g. how big does the CMA
>>>> region need to be, which clients are actually allocating memory).
>>>
>>> Memory leak is really general problem and could we handle it with
>>> page_owner?
>>>
>>>>
>>>> ftrace is certainly usable for tracing CMA allocation callers and
>>>> latency. ftrace is still only a fixed size buffer though so it's
>>>> possible for information to be lost if other logging is enabled.
>>>
>>> Sorry, I don't get with only above reasons why we need this. :(
>>>
>>>> For most of the CMA use cases, there is a very high cost if the
>>>> proper debugging information is not available so the more that
>>>> can be guaranteed the better.
>>>>
>>>> It's also worth noting that the SLUB allocator has a sysfs
>>>> interface for showing allocation callers when CONFIG_SLUB_DEBUG
>>>> is enabled.
>>>>
>>>> Thanks,
>>>> Laura
>>>>
>>>> --
>>>> Qualcomm Innovation Center, Inc.
>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>>>> a Linux Foundation Collaborative Project
>>>>
>>>> --
>>>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>>>> the body to majordomo@kvack.org.  For more info on Linux MM,
>>>> see: http://www.linux-mm.org/ .
>>>> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>>>
>>
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majordomo@kvack.org.  For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 0/3] mm: cma: /proc/cmainfo
  2014-12-31  0:58       ` Gioh Kim
  2014-12-31  2:18         ` Minchan Kim
@ 2014-12-31  6:47         ` Namhyung Kim
  2014-12-31  7:32           ` Minchan Kim
  1 sibling, 1 reply; 39+ messages in thread
From: Namhyung Kim @ 2014-12-31  6:47 UTC (permalink / raw)
  To: Gioh Kim
  Cc: Minchan Kim, Laura Abbott, Stefan I. Strogin, linux-mm,
	linux-kernel, Joonsoo Kim, Andrew Morton, Marek Szyprowski,
	Michal Nazarewicz, aneesh.kumar, Laurent Pinchart, Dmitry Safonov,
	Pintu Kumar, Weijie Yang, SeongJae Park, Hui Zhu, Dyasly Sergey,
	Vyacheslav Tyrtov, rostedt

Hello,

On Wed, Dec 31, 2014 at 09:58:04AM +0900, Gioh Kim wrote:
> 2014-12-30 i??i?? 1:47i?? Minchan Kim i?'(e??) i?' e,?:
> >On Mon, Dec 29, 2014 at 11:52:58AM -0800, Laura Abbott wrote:
> >>I've been meaning to write something like this for a while so I'm
> >>happy to see an attempt made to fix this. I can't speak for the
> >>author's reasons for wanting this information but there are
> >>several reasons why I was thinking of something similar.
> >>
> >>The most common bug reports seen internally on CMA are 1) CMA is
> >>too slow and 2) CMA failed to allocate memory. For #1, not all
> >>allocations may be slow so it's useful to be able to keep track
> >>of which allocations are taking too long. For #2, migration
> >
> >Then, I don't think we could keep all of allocations. What we need
> >is only slow allocations. I hope we can do that with ftrace.
> >
> >ex)
> >
> ># cd /sys/kernel/debug/tracing
> ># echo 1 > options/stacktrace
> ># echo cam_alloc > set_ftrace_filter
> ># echo your_threshold > tracing_thresh
> >
> >I know it doesn't work now but I think it's more flexible
> >and general way to handle such issues(ie, latency of some functions).
> >So, I hope we could enhance ftrace rather than new wheel.
> >Ccing ftrace people.
> 
> For CMA performance test or code flow check, ftrace is better.
> 
> ex)
> echo cma_alloc > /sys/kernel/debug/tracing/set_graph_function
> echo function_graph > /sys/kernel/debug/tracing/current_tracer
> echo funcgraph-proc > /sys/kernel/debug/tracing/trace_options
> echo nosleep-time > /sys/kernel/debug/tracing/trace_options
> echo funcgraph-tail > /sys/kernel/debug/tracing/trace_options
> echo 1 > /sys/kernel/debug/tracing/tracing_on
> 
> This can trace every cam_alloc and allocation time.
> I think ftrace is better to debug latency.
> If a buffer had allocated and had peak latency and freed,
> we can check it.

It'd be great if we can reuse the max latency tracing feature for the
function graph tracer in order to track a latency problem of an
arbitrary function more easily.  I've written a PoC code that can be
used like below..

  # cd /sys/kernel/debug/tracing
  # echo 0 > tracing_on
  # echo function_graph > current_tracer
  # echo funcgraph-latency > trace_options
  # echo cma_alloc > graph_latency_func
  # echo 1 > tracing_on

Now the tracing_max_latency file has a max latency of the cma_alloc()
in usec and the snapshot file contains a snapshot of all the codepath
to the function at the time.

Would anybody like to play with it? :)

Thanks,
Namhyung


diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 0eddfeb05fee..4a3d5ed2802c 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -723,6 +723,7 @@ extern char trace_find_mark(unsigned long long duration);
 #define TRACE_GRAPH_PRINT_ABS_TIME      0x20
 #define TRACE_GRAPH_PRINT_IRQS          0x40
 #define TRACE_GRAPH_PRINT_TAIL          0x80
+#define TRACE_GRAPH_MAX_LATENCY         0x100
 #define TRACE_GRAPH_PRINT_FILL_SHIFT	28
 #define TRACE_GRAPH_PRINT_FILL_MASK	(0x3 << TRACE_GRAPH_PRINT_FILL_SHIFT)
 
diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
index ba476009e5de..7fc3e21d1354 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -8,6 +8,7 @@
  */
 #include <linux/debugfs.h>
 #include <linux/uaccess.h>
+#include <linux/module.h>
 #include <linux/ftrace.h>
 #include <linux/slab.h>
 #include <linux/fs.h>
@@ -44,6 +45,10 @@ void ftrace_graph_stop(void)
 
 /* When set, irq functions will be ignored */
 static int ftrace_graph_skip_irqs;
+/* When set, record max latency of a given function */
+static int ftrace_graph_max_latency;
+
+static unsigned long ftrace_graph_latency_func;
 
 struct fgraph_cpu_data {
 	pid_t		last_pid;
@@ -84,6 +89,8 @@ static struct tracer_opt trace_opts[] = {
 	{ TRACER_OPT(funcgraph-irqs, TRACE_GRAPH_PRINT_IRQS) },
 	/* Display function name after trailing } */
 	{ TRACER_OPT(funcgraph-tail, TRACE_GRAPH_PRINT_TAIL) },
+	/* Record max latency of a given function } */
+	{ TRACER_OPT(funcgraph-latency, TRACE_GRAPH_MAX_LATENCY) },
 	{ } /* Empty entry */
 };
 
@@ -389,6 +396,22 @@ trace_graph_function(struct trace_array *tr,
 	__trace_graph_function(tr, ip, flags, pc);
 }
 
+#ifdef CONFIG_TRACER_MAX_TRACE
+static bool report_latency(struct trace_array *tr,
+			   struct ftrace_graph_ret *trace)
+{
+	unsigned long long delta = trace->rettime - trace->calltime;
+
+	if (!ftrace_graph_max_latency)
+		return false;
+
+	if (ftrace_graph_latency_func != trace->func)
+		return false;
+
+	return tr->max_latency < delta;
+}
+#endif
+
 void __trace_graph_return(struct trace_array *tr,
 				struct ftrace_graph_ret *trace,
 				unsigned long flags,
@@ -428,6 +451,22 @@ void trace_graph_return(struct ftrace_graph_ret *trace)
 	if (likely(disabled == 1)) {
 		pc = preempt_count();
 		__trace_graph_return(tr, trace, flags, pc);
+
+#ifdef CONFIG_TRACER_MAX_TRACE
+		if (report_latency(tr, trace)) {
+			static DEFINE_RAW_SPINLOCK(max_trace_lock);
+			unsigned long long delta;
+
+			delta = trace->rettime - trace->calltime;
+
+			raw_spin_lock(&max_trace_lock);
+			if (delta > tr->max_latency) {
+				tr->max_latency = delta;
+				update_max_tr(tr, current, cpu);
+			}
+			raw_spin_unlock(&max_trace_lock);
+		}
+#endif
 	}
 	atomic_dec(&data->disabled);
 	local_irq_restore(flags);
@@ -456,6 +495,11 @@ static int graph_trace_init(struct trace_array *tr)
 	int ret;
 
 	set_graph_array(tr);
+
+#ifdef CONFIG_TRACE_MAX_LATENCY
+	graph_array->max_latency = 0;
+#endif
+
 	if (tracing_thresh)
 		ret = register_ftrace_graph(&trace_graph_thresh_return,
 					    &trace_graph_thresh_entry);
@@ -1358,7 +1402,15 @@ func_graph_set_flag(struct trace_array *tr, u32 old_flags, u32 bit, int set)
 {
 	if (bit == TRACE_GRAPH_PRINT_IRQS)
 		ftrace_graph_skip_irqs = !set;
+	else if (bit == TRACE_GRAPH_MAX_LATENCY) {
+		ftrace_graph_max_latency = set;
 
+		if (set && !tr->allocated_snapshot) {
+			int ret = tracing_alloc_snapshot();
+			if (ret < 0)
+				return ret;
+		}
+	}
 	return 0;
 }
 
@@ -1425,6 +1477,43 @@ graph_depth_read(struct file *filp, char __user *ubuf, size_t cnt,
 	return simple_read_from_buffer(ubuf, cnt, ppos, buf, n);
 }
 
+static ssize_t
+graph_latency_write(struct file *filp, const char __user *ubuf, size_t cnt,
+		    loff_t *ppos)
+{
+	char buf[KSYM_SYMBOL_LEN];
+	long ret;
+
+	ret = strncpy_from_user(buf, ubuf, cnt);
+	if (ret <= 0)
+		return ret;
+
+	if (buf[ret - 1] == '\n')
+		buf[ret - 1] = '\0';
+
+	ftrace_graph_latency_func = kallsyms_lookup_name(buf);
+	if (!ftrace_graph_latency_func)
+		return -EINVAL;
+
+	*ppos += cnt;
+
+	return cnt;
+}
+
+static ssize_t
+graph_latency_read(struct file *filp, char __user *ubuf, size_t cnt,
+		   loff_t *ppos)
+{
+	char buf[KSYM_SYMBOL_LEN];
+
+	if (!ftrace_graph_latency_func)
+		return 0;
+
+	kallsyms_lookup(ftrace_graph_latency_func, NULL, NULL, NULL, buf);
+
+	return simple_read_from_buffer(ubuf, cnt, ppos, buf, strlen(buf));
+}
+
 static const struct file_operations graph_depth_fops = {
 	.open		= tracing_open_generic,
 	.write		= graph_depth_write,
@@ -1432,6 +1521,13 @@ static const struct file_operations graph_depth_fops = {
 	.llseek		= generic_file_llseek,
 };
 
+static const struct file_operations graph_latency_fops = {
+	.open		= tracing_open_generic,
+	.write		= graph_latency_write,
+	.read		= graph_latency_read,
+	.llseek		= generic_file_llseek,
+};
+
 static __init int init_graph_debugfs(void)
 {
 	struct dentry *d_tracer;
@@ -1442,6 +1538,10 @@ static __init int init_graph_debugfs(void)
 
 	trace_create_file("max_graph_depth", 0644, d_tracer,
 			  NULL, &graph_depth_fops);
+#ifdef CONFIG_TRACER_MAX_TRACE
+	trace_create_file("graph_latency_func", 0644, d_tracer,
+			  NULL, &graph_latency_fops);
+#endif
 
 	return 0;
 }

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH 0/3] mm: cma: /proc/cmainfo
  2014-12-31  6:47         ` Namhyung Kim
@ 2014-12-31  7:32           ` Minchan Kim
  0 siblings, 0 replies; 39+ messages in thread
From: Minchan Kim @ 2014-12-31  7:32 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Gioh Kim, Laura Abbott, Stefan I. Strogin, linux-mm, linux-kernel,
	Joonsoo Kim, Andrew Morton, Marek Szyprowski, Michal Nazarewicz,
	aneesh.kumar, Laurent Pinchart, Dmitry Safonov, Pintu Kumar,
	Weijie Yang, SeongJae Park, Hui Zhu, Dyasly Sergey,
	Vyacheslav Tyrtov, rostedt

On Wed, Dec 31, 2014 at 03:47:59PM +0900, Namhyung Kim wrote:
> Hello,
> 
> On Wed, Dec 31, 2014 at 09:58:04AM +0900, Gioh Kim wrote:
> > 2014-12-30 i??i?? 1:47i?? Minchan Kim i?'(e??) i?' e,?:
> > >On Mon, Dec 29, 2014 at 11:52:58AM -0800, Laura Abbott wrote:
> > >>I've been meaning to write something like this for a while so I'm
> > >>happy to see an attempt made to fix this. I can't speak for the
> > >>author's reasons for wanting this information but there are
> > >>several reasons why I was thinking of something similar.
> > >>
> > >>The most common bug reports seen internally on CMA are 1) CMA is
> > >>too slow and 2) CMA failed to allocate memory. For #1, not all
> > >>allocations may be slow so it's useful to be able to keep track
> > >>of which allocations are taking too long. For #2, migration
> > >
> > >Then, I don't think we could keep all of allocations. What we need
> > >is only slow allocations. I hope we can do that with ftrace.
> > >
> > >ex)
> > >
> > ># cd /sys/kernel/debug/tracing
> > ># echo 1 > options/stacktrace
> > ># echo cam_alloc > set_ftrace_filter
> > ># echo your_threshold > tracing_thresh
> > >
> > >I know it doesn't work now but I think it's more flexible
> > >and general way to handle such issues(ie, latency of some functions).
> > >So, I hope we could enhance ftrace rather than new wheel.
> > >Ccing ftrace people.
> > 
> > For CMA performance test or code flow check, ftrace is better.
> > 
> > ex)
> > echo cma_alloc > /sys/kernel/debug/tracing/set_graph_function
> > echo function_graph > /sys/kernel/debug/tracing/current_tracer
> > echo funcgraph-proc > /sys/kernel/debug/tracing/trace_options
> > echo nosleep-time > /sys/kernel/debug/tracing/trace_options
> > echo funcgraph-tail > /sys/kernel/debug/tracing/trace_options
> > echo 1 > /sys/kernel/debug/tracing/tracing_on
> > 
> > This can trace every cam_alloc and allocation time.
> > I think ftrace is better to debug latency.
> > If a buffer had allocated and had peak latency and freed,
> > we can check it.
> 
> It'd be great if we can reuse the max latency tracing feature for the
> function graph tracer in order to track a latency problem of an
> arbitrary function more easily.  I've written a PoC code that can be
> used like below..
> 
>   # cd /sys/kernel/debug/tracing
>   # echo 0 > tracing_on
>   # echo function_graph > current_tracer
>   # echo funcgraph-latency > trace_options
>   # echo cma_alloc > graph_latency_func
>   # echo 1 > tracing_on
> 
> Now the tracing_max_latency file has a max latency of the cma_alloc()
> in usec and the snapshot file contains a snapshot of all the codepath
> to the function at the time.
> 
> Would anybody like to play with it? :)

Thanks, Namhyung. I did and feel it would be useful to check only
max latency data.

Anyway, off-topic:
IMO, it would be very useful to check latency of several functions which
has different threshold at the same time without helping other tools.

> 
> Thanks,
> Namhyung
> 
> 
> diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> index 0eddfeb05fee..4a3d5ed2802c 100644
> --- a/kernel/trace/trace.h
> +++ b/kernel/trace/trace.h
> @@ -723,6 +723,7 @@ extern char trace_find_mark(unsigned long long duration);
>  #define TRACE_GRAPH_PRINT_ABS_TIME      0x20
>  #define TRACE_GRAPH_PRINT_IRQS          0x40
>  #define TRACE_GRAPH_PRINT_TAIL          0x80
> +#define TRACE_GRAPH_MAX_LATENCY         0x100
>  #define TRACE_GRAPH_PRINT_FILL_SHIFT	28
>  #define TRACE_GRAPH_PRINT_FILL_MASK	(0x3 << TRACE_GRAPH_PRINT_FILL_SHIFT)
>  
> diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
> index ba476009e5de..7fc3e21d1354 100644
> --- a/kernel/trace/trace_functions_graph.c
> +++ b/kernel/trace/trace_functions_graph.c
> @@ -8,6 +8,7 @@
>   */
>  #include <linux/debugfs.h>
>  #include <linux/uaccess.h>
> +#include <linux/module.h>
>  #include <linux/ftrace.h>
>  #include <linux/slab.h>
>  #include <linux/fs.h>
> @@ -44,6 +45,10 @@ void ftrace_graph_stop(void)
>  
>  /* When set, irq functions will be ignored */
>  static int ftrace_graph_skip_irqs;
> +/* When set, record max latency of a given function */
> +static int ftrace_graph_max_latency;
> +
> +static unsigned long ftrace_graph_latency_func;
>  
>  struct fgraph_cpu_data {
>  	pid_t		last_pid;
> @@ -84,6 +89,8 @@ static struct tracer_opt trace_opts[] = {
>  	{ TRACER_OPT(funcgraph-irqs, TRACE_GRAPH_PRINT_IRQS) },
>  	/* Display function name after trailing } */
>  	{ TRACER_OPT(funcgraph-tail, TRACE_GRAPH_PRINT_TAIL) },
> +	/* Record max latency of a given function } */
> +	{ TRACER_OPT(funcgraph-latency, TRACE_GRAPH_MAX_LATENCY) },
>  	{ } /* Empty entry */
>  };
>  
> @@ -389,6 +396,22 @@ trace_graph_function(struct trace_array *tr,
>  	__trace_graph_function(tr, ip, flags, pc);
>  }
>  
> +#ifdef CONFIG_TRACER_MAX_TRACE
> +static bool report_latency(struct trace_array *tr,
> +			   struct ftrace_graph_ret *trace)
> +{
> +	unsigned long long delta = trace->rettime - trace->calltime;
> +
> +	if (!ftrace_graph_max_latency)
> +		return false;
> +
> +	if (ftrace_graph_latency_func != trace->func)
> +		return false;
> +
> +	return tr->max_latency < delta;
> +}
> +#endif
> +
>  void __trace_graph_return(struct trace_array *tr,
>  				struct ftrace_graph_ret *trace,
>  				unsigned long flags,
> @@ -428,6 +451,22 @@ void trace_graph_return(struct ftrace_graph_ret *trace)
>  	if (likely(disabled == 1)) {
>  		pc = preempt_count();
>  		__trace_graph_return(tr, trace, flags, pc);
> +
> +#ifdef CONFIG_TRACER_MAX_TRACE
> +		if (report_latency(tr, trace)) {
> +			static DEFINE_RAW_SPINLOCK(max_trace_lock);
> +			unsigned long long delta;
> +
> +			delta = trace->rettime - trace->calltime;
> +
> +			raw_spin_lock(&max_trace_lock);
> +			if (delta > tr->max_latency) {
> +				tr->max_latency = delta;
> +				update_max_tr(tr, current, cpu);
> +			}
> +			raw_spin_unlock(&max_trace_lock);
> +		}
> +#endif
>  	}
>  	atomic_dec(&data->disabled);
>  	local_irq_restore(flags);
> @@ -456,6 +495,11 @@ static int graph_trace_init(struct trace_array *tr)
>  	int ret;
>  
>  	set_graph_array(tr);
> +
> +#ifdef CONFIG_TRACE_MAX_LATENCY
> +	graph_array->max_latency = 0;
> +#endif
> +
>  	if (tracing_thresh)
>  		ret = register_ftrace_graph(&trace_graph_thresh_return,
>  					    &trace_graph_thresh_entry);
> @@ -1358,7 +1402,15 @@ func_graph_set_flag(struct trace_array *tr, u32 old_flags, u32 bit, int set)
>  {
>  	if (bit == TRACE_GRAPH_PRINT_IRQS)
>  		ftrace_graph_skip_irqs = !set;
> +	else if (bit == TRACE_GRAPH_MAX_LATENCY) {
> +		ftrace_graph_max_latency = set;
>  
> +		if (set && !tr->allocated_snapshot) {
> +			int ret = tracing_alloc_snapshot();
> +			if (ret < 0)
> +				return ret;
> +		}
> +	}
>  	return 0;
>  }
>  
> @@ -1425,6 +1477,43 @@ graph_depth_read(struct file *filp, char __user *ubuf, size_t cnt,
>  	return simple_read_from_buffer(ubuf, cnt, ppos, buf, n);
>  }
>  
> +static ssize_t
> +graph_latency_write(struct file *filp, const char __user *ubuf, size_t cnt,
> +		    loff_t *ppos)
> +{
> +	char buf[KSYM_SYMBOL_LEN];
> +	long ret;
> +
> +	ret = strncpy_from_user(buf, ubuf, cnt);
> +	if (ret <= 0)
> +		return ret;
> +
> +	if (buf[ret - 1] == '\n')
> +		buf[ret - 1] = '\0';
> +
> +	ftrace_graph_latency_func = kallsyms_lookup_name(buf);
> +	if (!ftrace_graph_latency_func)
> +		return -EINVAL;
> +
> +	*ppos += cnt;
> +
> +	return cnt;
> +}
> +
> +static ssize_t
> +graph_latency_read(struct file *filp, char __user *ubuf, size_t cnt,
> +		   loff_t *ppos)
> +{
> +	char buf[KSYM_SYMBOL_LEN];
> +
> +	if (!ftrace_graph_latency_func)
> +		return 0;
> +
> +	kallsyms_lookup(ftrace_graph_latency_func, NULL, NULL, NULL, buf);
> +
> +	return simple_read_from_buffer(ubuf, cnt, ppos, buf, strlen(buf));
> +}
> +
>  static const struct file_operations graph_depth_fops = {
>  	.open		= tracing_open_generic,
>  	.write		= graph_depth_write,
> @@ -1432,6 +1521,13 @@ static const struct file_operations graph_depth_fops = {
>  	.llseek		= generic_file_llseek,
>  };
>  
> +static const struct file_operations graph_latency_fops = {
> +	.open		= tracing_open_generic,
> +	.write		= graph_latency_write,
> +	.read		= graph_latency_read,
> +	.llseek		= generic_file_llseek,
> +};
> +
>  static __init int init_graph_debugfs(void)
>  {
>  	struct dentry *d_tracer;
> @@ -1442,6 +1538,10 @@ static __init int init_graph_debugfs(void)
>  
>  	trace_create_file("max_graph_depth", 0644, d_tracer,
>  			  NULL, &graph_depth_fops);
> +#ifdef CONFIG_TRACER_MAX_TRACE
> +	trace_create_file("graph_latency_func", 0644, d_tracer,
> +			  NULL, &graph_latency_fops);
> +#endif
>  
>  	return 0;
>  }
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 0/3] mm: cma: /proc/cmainfo
  2014-12-30  4:47     ` Minchan Kim
  2014-12-30 22:00       ` Laura Abbott
  2014-12-31  0:58       ` Gioh Kim
@ 2015-01-09 14:19       ` Steven Rostedt
  2015-01-09 14:35         ` Steven Rostedt
  2015-01-13  2:27         ` Minchan Kim
  2 siblings, 2 replies; 39+ messages in thread
From: Steven Rostedt @ 2015-01-09 14:19 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Laura Abbott, Stefan I. Strogin, linux-mm, linux-kernel,
	Joonsoo Kim, Andrew Morton, Marek Szyprowski, Michal Nazarewicz,
	aneesh.kumar, Laurent Pinchart, Dmitry Safonov, Pintu Kumar,
	Weijie Yang, SeongJae Park, Hui Zhu, Dyasly Sergey,
	Vyacheslav Tyrtov, namhyung


Wow, too much work over the holidays ;-)


On Tue, 30 Dec 2014 13:47:26 +0900
Minchan Kim <minchan@kernel.org> wrote:


> Then, I don't think we could keep all of allocations. What we need
> is only slow allocations. I hope we can do that with ftrace.
> 
> ex)
> 
> # cd /sys/kernel/debug/tracing
> # echo 1 > options/stacktrace
> # echo cam_alloc > set_ftrace_filter
> # echo your_threshold > tracing_thresh
> 
> I know it doesn't work now but I think it's more flexible
> and general way to handle such issues(ie, latency of some functions).
> So, I hope we could enhance ftrace rather than new wheel.
> Ccing ftrace people.
> 

I've been working on trace-cmd this month and came up with a new
"profile" command. I don't have cma_alloc but doing something like this
with kmalloc.


# trace-cmd profile -S -p function_graph -l __kmalloc -l '__kmalloc:stacktrace' --stderr workload 2>profile.out

and this gives me in profile.out, something like this:

------
CPU: 0
entries: 0
overrun: 0
commit overrun: 0
bytes: 3560
oldest event ts:   349.925480
now ts:   356.910819
dropped events: 0
read events: 36

CPU: 1
entries: 0
overrun: 0
commit overrun: 0
bytes: 408
oldest event ts:   354.610624
now ts:   356.910838
dropped events: 0
read events: 48

CPU: 2
entries: 0
overrun: 0
commit overrun: 0
bytes: 3184
oldest event ts:   356.761870
now ts:   356.910854
dropped events: 0
read events: 1830

CPU: 3
entries: 6
overrun: 0
commit overrun: 0
bytes: 2664
oldest event ts:   356.440675
now ts:   356.910875
dropped events: 0
read events: 717

[...]

task: <...>-2880
  Event: func: __kmalloc() (74) Total: 53254 Avg: 719 Max: 1095 Min:481
          | 
          + ftrace_ops_list_func (0xffffffff810c229e)
              100% (74) time:53254 max:1095 min:481 avg:719
               ftrace_call (0xffffffff81526047)
               trace_preempt_on (0xffffffff810d28ff)
               preempt_count_sub (0xffffffff81061c62)
               __mutex_lock_slowpath (0xffffffff81522807)
               __kmalloc (0xffffffff811323f3)
               __kmalloc (0xffffffff811323f3)
               tracing_buffers_splice_read (0xffffffff810ca23e)
                | 
                + set_next_entity (0xffffffff81067027)
                |   66% (49) time:34925 max:1044 min:481 avg:712
                |    __switch_to (0xffffffff810016d7)
                |    trace_hardirqs_on (0xffffffff810d28db)
                |    _raw_spin_unlock_irq (0xffffffff81523a8e)
                |    trace_preempt_on (0xffffffff810d28ff)
                |    preempt_count_sub (0xffffffff81061c62)
                |    __schedule (0xffffffff815204d3)
                |    trace_preempt_on (0xffffffff810d28ff)
                |    buffer_spd_release (0xffffffff810c91fd)
                |    SyS_splice (0xffffffff8115dccf)
                |    system_call_fastpath (0xffffffff81523f92)
                | 
                + do_read_fault.isra.74 (0xffffffff8111431d)
                |   24% (18) time:12654 max:1008 min:481 avg:703
                |     | 
                |     + select_task_rq_fair (0xffffffff81067806)
                |     |   89% (16) time:11234 max:1008 min:481 avg:702
                |     |    trace_preempt_on (0xffffffff810d28ff)
                |     |    buffer_spd_release (0xffffffff810c91fd)
                |     |    SyS_splice (0xffffffff8115dccf)
                |     |    system_call_fastpath (0xffffffff81523f92)
                |     | 
                |     + handle_mm_fault (0xffffffff81114df4)
                |         11% (2) time:1420 max:879 min:541 avg:710
                |          trace_preempt_on (0xffffffff810d28ff)
                |          buffer_spd_release (0xffffffff810c91fd)
                |          SyS_splice (0xffffffff8115dccf)
                |          system_call_fastpath (0xffffffff81523f92)
                |       
                | 
                | 
                + update_stats_wait_end (0xffffffff81066c5c)
                |   6% (4) time:3153 max:1095 min:635 avg:788
                |    set_next_entity (0xffffffff81067027)
                |    __switch_to (0xffffffff810016d7)
                |    trace_hardirqs_on (0xffffffff810d28db)
                |    _raw_spin_unlock_irq (0xffffffff81523a8e)
                |    trace_preempt_on (0xffffffff810d28ff)
                |    preempt_count_sub (0xffffffff81061c62)
                |    __schedule (0xffffffff815204d3)
                |    trace_preempt_on (0xffffffff810d28ff)
                |    buffer_spd_release (0xffffffff810c91fd)
                |    SyS_splice (0xffffffff8115dccf)
                |    system_call_fastpath (0xffffffff81523f92)
                | 
                + _raw_spin_unlock (0xffffffff81523af5)
                |   3% (2) time:1854 max:936 min:918 avg:927
                |    do_read_fault.isra.74 (0xffffffff8111431d)
                |    handle_mm_fault (0xffffffff81114df4)
                |    buffer_spd_release (0xffffffff810c91fd)
                |    SyS_splice (0xffffffff8115dccf)
                |    system_call_fastpath (0xffffffff81523f92)
                | 
                + trace_hardirqs_off (0xffffffff810d2891)
                    1% (1) time:668 max:668 min:668 avg:668
                     kmem_cache_free (0xffffffff81130e48)
                     __dequeue_signal (0xffffffff8104c802)
                     trace_preempt_on (0xffffffff810d28ff)
                     preempt_count_sub (0xffffffff81061c62)
                     _raw_spin_unlock_irq (0xffffffff81523a8e)
                     recalc_sigpending (0xffffffff8104c5d1)
                     __set_task_blocked (0xffffffff8104cd2e)
                     trace_preempt_on (0xffffffff810d28ff)
                     preempt_count_sub (0xffffffff81061c62)
                     preempt_count_sub (0xffffffff81061c62)
                     buffer_spd_release (0xffffffff810c91fd)
                     SyS_splice (0xffffffff8115dccf)
                     system_call_fastpath (0xffffffff81523f92)
                  
If you want better names, I would add "-e sched_switch", as that will
record the comms of the tasks and you don't end up with a bunch of
"<...>".

Is this something you are looking for. The profile command does not
save to disk, thus it does the analysis live, and you don't need to
worry about running out of disk space. Although, since it is live, it
may tend to drop more events (see the "overrun values").

You can get trace-cmd from:

git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/trace-cmd.git

You'll need the latest from the master branch, as even 2.5 doesn't have
the --stderr yet.

Make sure to do a make install and make install_doc, then you can do:

 man trace-cmd-record
 man trace-cmd-profile

to read about all the options.

-- Steve

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 0/3] mm: cma: /proc/cmainfo
  2015-01-09 14:19       ` Steven Rostedt
@ 2015-01-09 14:35         ` Steven Rostedt
  2015-01-13  2:27         ` Minchan Kim
  1 sibling, 0 replies; 39+ messages in thread
From: Steven Rostedt @ 2015-01-09 14:35 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Laura Abbott, Stefan I. Strogin, linux-mm, linux-kernel,
	Joonsoo Kim, Andrew Morton, Marek Szyprowski, Michal Nazarewicz,
	aneesh.kumar, Laurent Pinchart, Dmitry Safonov, Pintu Kumar,
	Weijie Yang, SeongJae Park, Hui Zhu, Dyasly Sergey,
	Vyacheslav Tyrtov, namhyung

On Fri, 9 Jan 2015 09:19:04 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> task: <...>-2880
>   Event: func: __kmalloc() (74) Total: 53254 Avg: 719 Max: 1095 Min:481

I forgot to mention that all times are in nanoseconds (or whatever the
trace clock is set at).

>           | 
>           + ftrace_ops_list_func (0xffffffff810c229e)
>               100% (74) time:53254 max:1095 min:481 avg:719
>                ftrace_call (0xffffffff81526047)
>                trace_preempt_on (0xffffffff810d28ff)
>                preempt_count_sub (0xffffffff81061c62)
>                __mutex_lock_slowpath (0xffffffff81522807)
>                __kmalloc (0xffffffff811323f3)
>                __kmalloc (0xffffffff811323f3)

The above may be a bit confusing, as the stack trace included more than
it should have (it's variable and hard to get right).
ftrace_ops_list_func() did not call kmalloc, but it did call the
stack trace and was included. You want to look below to find the
interesting data.

This is still a new feature, and is using some of the kernel tracing
more than it has been in the past. There's still a few eggs that need
to be boiled here.


>                tracing_buffers_splice_read (0xffffffff810ca23e)

All the kmallocs for this task was called by
tracing_buffers_splice_read() (hmm, I chose to show you the trace-cmd
profile on itself. If I had included "-F -c" (follow workload only)  or
-e sched_switch I would have known which task to look at).

>                 | 
>                 + set_next_entity (0xffffffff81067027)
>                 |   66% (49) time:34925 max:1044 min:481 avg:712
>                 |    __switch_to (0xffffffff810016d7)
>                 |    trace_hardirqs_on (0xffffffff810d28db)
>                 |    _raw_spin_unlock_irq (0xffffffff81523a8e)
>                 |    trace_preempt_on (0xffffffff810d28ff)
>                 |    preempt_count_sub (0xffffffff81061c62)
>                 |    __schedule (0xffffffff815204d3)
>                 |    trace_preempt_on (0xffffffff810d28ff)
>                 |    buffer_spd_release (0xffffffff810c91fd)
>                 |    SyS_splice (0xffffffff8115dccf)
>                 |    system_call_fastpath (0xffffffff81523f92)
>                 | 
>                 + do_read_fault.isra.74 (0xffffffff8111431d)

I'm not sure how much I trust this. I don't have FRAME_POINTERS
enabled, so the stack traces may not be as accurate.

But you get the idea, and this can show you where the slow paths lie.

-- Steve


>                 |   24% (18) time:12654 max:1008 min:481 avg:703
>                 |     | 
>                 |     + select_task_rq_fair (0xffffffff81067806)
>                 |     |   89% (16) time:11234 max:1008 min:481 avg:702
>                 |     |    trace_preempt_on (0xffffffff810d28ff)
>                 |     |    buffer_spd_release (0xffffffff810c91fd)
>                 |     |    SyS_splice (0xffffffff8115dccf)
>                 |     |    system_call_fastpath (0xffffffff81523f92)
>                 |     | 
>                 |     + handle_mm_fault (0xffffffff81114df4)
>                 |         11% (2) time:1420 max:879 min:541 avg:710
>                 |          trace_preempt_on (0xffffffff810d28ff)
>                 |          buffer_spd_release (0xffffffff810c91fd)
>                 |          SyS_splice (0xffffffff8115dccf)
>                 |          system_call_fastpath (0xffffffff81523f92)
>                 |       
>                 | 
>                 | 
>                 + update_stats_wait_end (0xffffffff81066c5c)
>                 |   6% (4) time:3153 max:1095 min:635 avg:788
>                 |    set_next_entity (0xffffffff81067027)
>                 |    __switch_to (0xffffffff810016d7)
>                 |    trace_hardirqs_on (0xffffffff810d28db)
>                 |    _raw_spin_unlock_irq (0xffffffff81523a8e)
>                 |    trace_preempt_on (0xffffffff810d28ff)
>                 |    preempt_count_sub (0xffffffff81061c62)
>                 |    __schedule (0xffffffff815204d3)
>                 |    trace_preempt_on (0xffffffff810d28ff)
>                 |    buffer_spd_release (0xffffffff810c91fd)
>                 |    SyS_splice (0xffffffff8115dccf)
>                 |    system_call_fastpath (0xffffffff81523f92)
>                 | 
>                 + _raw_spin_unlock (0xffffffff81523af5)
>                 |   3% (2) time:1854 max:936 min:918 avg:927
>                 |    do_read_fault.isra.74 (0xffffffff8111431d)
>                 |    handle_mm_fault (0xffffffff81114df4)
>                 |    buffer_spd_release (0xffffffff810c91fd)
>                 |    SyS_splice (0xffffffff8115dccf)
>                 |    system_call_fastpath (0xffffffff81523f92)
>                 | 
>                 + trace_hardirqs_off (0xffffffff810d2891)
>                     1% (1) time:668 max:668 min:668 avg:668
>                      kmem_cache_free (0xffffffff81130e48)
>                      __dequeue_signal (0xffffffff8104c802)
>                      trace_preempt_on (0xffffffff810d28ff)
>                      preempt_count_sub (0xffffffff81061c62)
>                      _raw_spin_unlock_irq (0xffffffff81523a8e)
>                      recalc_sigpending (0xffffffff8104c5d1)
>                      __set_task_blocked (0xffffffff8104cd2e)
>                      trace_preempt_on (0xffffffff810d28ff)
>                      preempt_count_sub (0xffffffff81061c62)
>                      preempt_count_sub (0xffffffff81061c62)
>                      buffer_spd_release (0xffffffff810c91fd)
>                      SyS_splice (0xffffffff8115dccf)
>                      system_call_fastpath (0xffffffff81523f92)
>                   

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 0/3] mm: cma: /proc/cmainfo
  2015-01-09 14:19       ` Steven Rostedt
  2015-01-09 14:35         ` Steven Rostedt
@ 2015-01-13  2:27         ` Minchan Kim
  1 sibling, 0 replies; 39+ messages in thread
From: Minchan Kim @ 2015-01-13  2:27 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Laura Abbott, Stefan I. Strogin, linux-mm, linux-kernel,
	Joonsoo Kim, Andrew Morton, Marek Szyprowski, Michal Nazarewicz,
	aneesh.kumar, Laurent Pinchart, Dmitry Safonov, Pintu Kumar,
	Weijie Yang, SeongJae Park, Hui Zhu, Dyasly Sergey,
	Vyacheslav Tyrtov, namhyung

Hello, Steven,

On Fri, Jan 09, 2015 at 09:19:04AM -0500, Steven Rostedt wrote:
> 
> Wow, too much work over the holidays ;-)

Pretend to be diligent.

> 
> 
> On Tue, 30 Dec 2014 13:47:26 +0900
> Minchan Kim <minchan@kernel.org> wrote:
> 
> 
> > Then, I don't think we could keep all of allocations. What we need
> > is only slow allocations. I hope we can do that with ftrace.
> > 
> > ex)
> > 
> > # cd /sys/kernel/debug/tracing
> > # echo 1 > options/stacktrace
> > # echo cam_alloc > set_ftrace_filter
> > # echo your_threshold > tracing_thresh
> > 
> > I know it doesn't work now but I think it's more flexible
> > and general way to handle such issues(ie, latency of some functions).
> > So, I hope we could enhance ftrace rather than new wheel.
> > Ccing ftrace people.
> > 
> 
> I've been working on trace-cmd this month and came up with a new
> "profile" command. I don't have cma_alloc but doing something like this
> with kmalloc.
> 
> 
> # trace-cmd profile -S -p function_graph -l __kmalloc -l '__kmalloc:stacktrace' --stderr workload 2>profile.out
> 
> and this gives me in profile.out, something like this:
> 
> ------
> CPU: 0
> entries: 0
> overrun: 0
> commit overrun: 0
> bytes: 3560
> oldest event ts:   349.925480
> now ts:   356.910819
> dropped events: 0
> read events: 36
> 
> CPU: 1
> entries: 0
> overrun: 0
> commit overrun: 0
> bytes: 408
> oldest event ts:   354.610624
> now ts:   356.910838
> dropped events: 0
> read events: 48
> 
> CPU: 2
> entries: 0
> overrun: 0
> commit overrun: 0
> bytes: 3184
> oldest event ts:   356.761870
> now ts:   356.910854
> dropped events: 0
> read events: 1830
> 
> CPU: 3
> entries: 6
> overrun: 0
> commit overrun: 0
> bytes: 2664
> oldest event ts:   356.440675
> now ts:   356.910875
> dropped events: 0
> read events: 717
> 
> [...]
> 
> task: <...>-2880
>   Event: func: __kmalloc() (74) Total: 53254 Avg: 719 Max: 1095 Min:481
>           | 
>           + ftrace_ops_list_func (0xffffffff810c229e)
>               100% (74) time:53254 max:1095 min:481 avg:719
>                ftrace_call (0xffffffff81526047)
>                trace_preempt_on (0xffffffff810d28ff)
>                preempt_count_sub (0xffffffff81061c62)
>                __mutex_lock_slowpath (0xffffffff81522807)
>                __kmalloc (0xffffffff811323f3)
>                __kmalloc (0xffffffff811323f3)
>                tracing_buffers_splice_read (0xffffffff810ca23e)
>                 | 
>                 + set_next_entity (0xffffffff81067027)
>                 |   66% (49) time:34925 max:1044 min:481 avg:712
>                 |    __switch_to (0xffffffff810016d7)
>                 |    trace_hardirqs_on (0xffffffff810d28db)
>                 |    _raw_spin_unlock_irq (0xffffffff81523a8e)
>                 |    trace_preempt_on (0xffffffff810d28ff)
>                 |    preempt_count_sub (0xffffffff81061c62)
>                 |    __schedule (0xffffffff815204d3)
>                 |    trace_preempt_on (0xffffffff810d28ff)
>                 |    buffer_spd_release (0xffffffff810c91fd)
>                 |    SyS_splice (0xffffffff8115dccf)
>                 |    system_call_fastpath (0xffffffff81523f92)
>                 | 
>                 + do_read_fault.isra.74 (0xffffffff8111431d)
>                 |   24% (18) time:12654 max:1008 min:481 avg:703
>                 |     | 
>                 |     + select_task_rq_fair (0xffffffff81067806)
>                 |     |   89% (16) time:11234 max:1008 min:481 avg:702
>                 |     |    trace_preempt_on (0xffffffff810d28ff)
>                 |     |    buffer_spd_release (0xffffffff810c91fd)
>                 |     |    SyS_splice (0xffffffff8115dccf)
>                 |     |    system_call_fastpath (0xffffffff81523f92)
>                 |     | 
>                 |     + handle_mm_fault (0xffffffff81114df4)
>                 |         11% (2) time:1420 max:879 min:541 avg:710
>                 |          trace_preempt_on (0xffffffff810d28ff)
>                 |          buffer_spd_release (0xffffffff810c91fd)
>                 |          SyS_splice (0xffffffff8115dccf)
>                 |          system_call_fastpath (0xffffffff81523f92)
>                 |       
>                 | 
>                 | 
>                 + update_stats_wait_end (0xffffffff81066c5c)
>                 |   6% (4) time:3153 max:1095 min:635 avg:788
>                 |    set_next_entity (0xffffffff81067027)
>                 |    __switch_to (0xffffffff810016d7)
>                 |    trace_hardirqs_on (0xffffffff810d28db)
>                 |    _raw_spin_unlock_irq (0xffffffff81523a8e)
>                 |    trace_preempt_on (0xffffffff810d28ff)
>                 |    preempt_count_sub (0xffffffff81061c62)
>                 |    __schedule (0xffffffff815204d3)
>                 |    trace_preempt_on (0xffffffff810d28ff)
>                 |    buffer_spd_release (0xffffffff810c91fd)
>                 |    SyS_splice (0xffffffff8115dccf)
>                 |    system_call_fastpath (0xffffffff81523f92)
>                 | 
>                 + _raw_spin_unlock (0xffffffff81523af5)
>                 |   3% (2) time:1854 max:936 min:918 avg:927
>                 |    do_read_fault.isra.74 (0xffffffff8111431d)
>                 |    handle_mm_fault (0xffffffff81114df4)
>                 |    buffer_spd_release (0xffffffff810c91fd)
>                 |    SyS_splice (0xffffffff8115dccf)
>                 |    system_call_fastpath (0xffffffff81523f92)
>                 | 
>                 + trace_hardirqs_off (0xffffffff810d2891)
>                     1% (1) time:668 max:668 min:668 avg:668
>                      kmem_cache_free (0xffffffff81130e48)
>                      __dequeue_signal (0xffffffff8104c802)
>                      trace_preempt_on (0xffffffff810d28ff)
>                      preempt_count_sub (0xffffffff81061c62)
>                      _raw_spin_unlock_irq (0xffffffff81523a8e)
>                      recalc_sigpending (0xffffffff8104c5d1)
>                      __set_task_blocked (0xffffffff8104cd2e)
>                      trace_preempt_on (0xffffffff810d28ff)
>                      preempt_count_sub (0xffffffff81061c62)
>                      preempt_count_sub (0xffffffff81061c62)
>                      buffer_spd_release (0xffffffff810c91fd)
>                      SyS_splice (0xffffffff8115dccf)
>                      system_call_fastpath (0xffffffff81523f92)
>                   

Looks great!

> If you want better names, I would add "-e sched_switch", as that will
> record the comms of the tasks and you don't end up with a bunch of
> "<...>".

Good tip.

> 
> Is this something you are looking for. The profile command does not
> save to disk, thus it does the analysis live, and you don't need to
> worry about running out of disk space. Although, since it is live, it
> may tend to drop more events (see the "overrun values").
> 
> You can get trace-cmd from:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/trace-cmd.git
> 
> You'll need the latest from the master branch, as even 2.5 doesn't have
> the --stderr yet.
> 
> Make sure to do a make install and make install_doc, then you can do:
> 
>  man trace-cmd-record
>  man trace-cmd-profile
> 
> to read about all the options.

Thansk for giving me a chance to use great tool!

> 
> -- Steve
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 0/3] mm: cma: /proc/cmainfo
  2014-12-29  2:36 ` [PATCH 0/3] mm: cma: /proc/cmainfo Minchan Kim
  2014-12-29 19:52   ` Laura Abbott
@ 2015-01-02  5:11   ` Pavel Machek
  2015-01-22 15:44     ` Stefan Strogin
  1 sibling, 1 reply; 39+ messages in thread
From: Pavel Machek @ 2015-01-02  5:11 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Stefan I. Strogin, linux-mm, linux-kernel, Joonsoo Kim,
	Andrew Morton, Marek Szyprowski, Michal Nazarewicz, aneesh.kumar,
	Laurent Pinchart, Dmitry Safonov, Pintu Kumar, Weijie Yang,
	Laura Abbott, SeongJae Park, Hui Zhu, Dyasly Sergey,
	Vyacheslav Tyrtov

On Mon 2014-12-29 11:36:39, Minchan Kim wrote:
> Hello,
> 
> On Fri, Dec 26, 2014 at 05:39:01PM +0300, Stefan I. Strogin wrote:
> > Hello all,
> > 
> > Here is a patch set that adds /proc/cmainfo.
> > 
> > When compiled with CONFIG_CMA_DEBUG /proc/cmainfo will contain information
> > about about total, used, maximum free contiguous chunk and all currently
> > allocated contiguous buffers in CMA regions. The information about allocated
> > CMA buffers includes pid, comm, allocation latency and stacktrace at the
> > moment of allocation.

We should not add new non-process related files in
/proc. So... NAK. Should this go to debugfs instead?

> It just says what you are doing but you didn't say why we need it.
> I can guess but clear description(ie, the problem what you want to
> solve with this patchset) would help others to review, for instance,
> why we need latency, why we need callstack, why we need new wheel
> rather than ftrace and so on.
> 
> Thanks.
> 
> > 
> > Example:
> > 
> > # cat /proc/cmainfo 
> > CMARegion stat:    65536 kB total,      248 kB used,    65216 kB max contiguous chunk


-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 0/3] mm: cma: /proc/cmainfo
  2015-01-02  5:11   ` Pavel Machek
@ 2015-01-22 15:44     ` Stefan Strogin
  0 siblings, 0 replies; 39+ messages in thread
From: Stefan Strogin @ 2015-01-22 15:44 UTC (permalink / raw)
  To: Pavel Machek, Minchan Kim
  Cc: linux-mm, linux-kernel, Joonsoo Kim, Andrew Morton,
	Marek Szyprowski, Michal Nazarewicz, aneesh.kumar,
	Laurent Pinchart, Dmitry Safonov, Pintu Kumar, Weijie Yang,
	Laura Abbott, SeongJae Park, Hui Zhu, Dyasly Sergey,
	Vyacheslav Tyrtov, s.strogin

Hello Pavel,

On 02/01/15 08:11, Pavel Machek wrote:
> On Mon 2014-12-29 11:36:39, Minchan Kim wrote:
>> Hello,
>>
>> On Fri, Dec 26, 2014 at 05:39:01PM +0300, Stefan I. Strogin wrote:
>>> Hello all,
>>>
>>> Here is a patch set that adds /proc/cmainfo.
>>>
>>> When compiled with CONFIG_CMA_DEBUG /proc/cmainfo will contain information
>>> about about total, used, maximum free contiguous chunk and all currently
>>> allocated contiguous buffers in CMA regions. The information about allocated
>>> CMA buffers includes pid, comm, allocation latency and stacktrace at the
>>> moment of allocation.
> We should not add new non-process related files in
> /proc. So... NAK. Should this go to debugfs instead?

As you say, I'll move it to debugfs and also split it by CMA region.
Something like: /sys/kernel/debug/cma/*/allocated
Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2015-01-23 12:32 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-12-26 14:39 [PATCH 0/3] mm: cma: /proc/cmainfo Stefan I. Strogin
2014-12-26 14:39 ` [PATCH 1/3] stacktrace: add seq_print_stack_trace() Stefan I. Strogin
2014-12-27  7:04   ` SeongJae Park
2014-12-26 14:39 ` [PATCH 2/3] mm: cma: introduce /proc/cmainfo Stefan I. Strogin
2014-12-26 16:02   ` Michal Nazarewicz
2014-12-29 14:09     ` Stefan Strogin
2014-12-29 17:26       ` Michal Nazarewicz
2014-12-31  1:14       ` Gioh Kim
2015-01-23 12:32         ` Stefan Strogin
2014-12-29 21:11   ` Laura Abbott
2015-01-21 14:18     ` Stefan Strogin
2014-12-30  4:38   ` Joonsoo Kim
2015-01-22 15:35     ` Stefan Strogin
2015-01-23  6:35       ` Joonsoo Kim
2014-12-26 14:39 ` [PATCH 3/3] cma: add functions to get region pages counters Stefan I. Strogin
2014-12-26 16:10   ` Michal Nazarewicz
2014-12-27  7:18   ` SeongJae Park
2014-12-29  5:56     ` Safonov Dmitry
2014-12-29 14:12       ` Stefan Strogin
2014-12-30  2:26   ` Joonsoo Kim
2014-12-30 14:41     ` Michal Nazarewicz
2014-12-30 14:46       ` Safonov Dmitry
2014-12-29  2:36 ` [PATCH 0/3] mm: cma: /proc/cmainfo Minchan Kim
2014-12-29 19:52   ` Laura Abbott
2014-12-30  4:47     ` Minchan Kim
2014-12-30 22:00       ` Laura Abbott
2014-12-31  0:25         ` Minchan Kim
2015-01-21 13:52           ` Stefan Strogin
2015-01-23  6:33             ` Joonsoo Kim
2014-12-31  0:58       ` Gioh Kim
2014-12-31  2:18         ` Minchan Kim
2014-12-31  2:45           ` Gioh Kim
2014-12-31  6:47         ` Namhyung Kim
2014-12-31  7:32           ` Minchan Kim
2015-01-09 14:19       ` Steven Rostedt
2015-01-09 14:35         ` Steven Rostedt
2015-01-13  2:27         ` Minchan Kim
2015-01-02  5:11   ` Pavel Machek
2015-01-22 15:44     ` Stefan Strogin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).