public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCHSET] workqueue, writeback: better worker information in task dumps
@ 2013-03-30  3:00 Tejun Heo
  2013-03-30  3:00 ` [PATCH 1/3] kthread: implement probe_kthread_data() Tejun Heo
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Tejun Heo @ 2013-03-30  3:00 UTC (permalink / raw)
  To: axboe, akpm; +Cc: jack, david, linux-kernel

Hello,

One of the problems that arise when converting dedicated custom
threadpool to workqueue is that the shared worker pool used by
workqueue anonimizes each worker making it more difficult to identify
what the worker was doing on which target from the output of sysrq-t
or debug dump from oops, BUG() and friends.

For example, after writeback is converted to use workqueue instead of
priviate thread pool, there's no easy to tell which backing device a
writeback work item was working on at the time of task dump, which,
according to our writeback brethren, is important in tracking down
issues with a lot of mounted file systems on a lot of different
devices.

This patchset implements a way for a work function to mark its
execution instance so that task dump of the worker task includes
information to indicate what the work item was doing.

An example WARN dump would look like the following.

 WARNING: at /work/os/work/fs/fs-writeback.c:1015 bdi_writeback_workfn+0x2b4/0x3c0()
 Modules linked in:
 Pid: 28, comm: kworker/u18:0 Not tainted 3.9.0-rc1-work+ #24 empty empty/S3992
 Workqueue: writeback bdi_writeback_workfn (flush-8:16)
  ffffffff820a3a98 ffff88015b927cb8 ffffffff81c61855 ffff88015b927cf8
  ffffffff8108f500 0000000000000000 ffff88007a171948 ffff88007a1716b0
  ffff88015b49df00 ffff88015b8d3940 0000000000000000 ffff88015b927d08
 Call Trace:
  [<ffffffff81c61855>] dump_stack+0x19/0x1b
  [<ffffffff8108f500>] warn_slowpath_common+0x70/0xa0
  ...

This patchset contains the following three patches.

 0001-kthread-implement-probe_kthread_data.patch
 0002-workqueue-include-workqueue-info-when-printing-debug.patch
 0003-writeback-set-worker-desc-to-identify-writeback-work.patch

0001 adds a speculative accessor for kthread_data.

0002 implements worker description.

0003 updates writeback to set its flusher name as worker description.

This patchset is on top of

  [1] workqueue: NUMA affinity for unbound workqueues
+ [2] writeback: convert writeback to unbound workqueue
+ [3] arch: unify task dump debug info

and available in the following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git review-better-dbg

Due to the dependencies, routing this set would be a bit complicated.
I think it would be best to first pull wq/for-3.10 to block tree and
apply workqueue conversion patchset.  The patches in this series then
can go through -mm on top of all three pieces - workqueue, block and
archs.

diffstat follows.  Thanks.

 fs/fs-writeback.c           |    1 
 include/linux/kthread.h     |    1 
 include/linux/workqueue.h   |    5 ++
 kernel/kthread.c            |   19 ++++++++++
 kernel/sched/core.c         |    1 
 kernel/workqueue.c          |   79 ++++++++++++++++++++++++++++++++++++++++++++
 kernel/workqueue_internal.h |   12 ++++++
 lib/dump_stack.c            |    2 +
 8 files changed, 119 insertions(+), 1 deletion(-)

--
tejun

[1] http://thread.gmane.org/gmane.linux.kernel/1465626
[2] http://thread.gmane.org/gmane.linux.kernel/1453006
[3] https://lkml.org/lkml/2013/3/29/354

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/3] kthread: implement probe_kthread_data()
  2013-03-30  3:00 [PATCHSET] workqueue, writeback: better worker information in task dumps Tejun Heo
@ 2013-03-30  3:00 ` Tejun Heo
  2013-03-30 14:36   ` Oleg Nesterov
  2013-04-03  9:26   ` Jan Kara
  2013-03-30  3:00 ` [PATCH 2/3] workqueue: include workqueue info when printing debug dump of a worker task Tejun Heo
  2013-03-30  3:00 ` [PATCH 3/3] writeback: set worker desc to identify writeback workers in task dumps Tejun Heo
  2 siblings, 2 replies; 10+ messages in thread
From: Tejun Heo @ 2013-03-30  3:00 UTC (permalink / raw)
  To: axboe, akpm; +Cc: jack, david, linux-kernel, Tejun Heo, Oleg Nesterov

Implement probe_kthread_data() which returns kthread_data if
accessible.  The function is equivalent to kthread_data() except that
the specified @task may not be a kthread or its vfork_done is already
cleared rendering struct kthread inaccessible.  In the former case,
probe_kthread_data() may return any value.  In the latter, NULL.

This will be used to safely print debug information without affecting
synchronization in the normal paths.  Workqueue debug info printing on
dump_stack() and friends will make use of it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 include/linux/kthread.h |  1 +
 kernel/kthread.c        | 19 +++++++++++++++++++
 2 files changed, 20 insertions(+)

diff --git a/include/linux/kthread.h b/include/linux/kthread.h
index 8d81664..7dcef33 100644
--- a/include/linux/kthread.h
+++ b/include/linux/kthread.h
@@ -43,6 +43,7 @@ bool kthread_should_stop(void);
 bool kthread_should_park(void);
 bool kthread_freezable_should_stop(bool *was_frozen);
 void *kthread_data(struct task_struct *k);
+void *probe_kthread_data(struct task_struct *k);
 int kthread_park(struct task_struct *k);
 void kthread_unpark(struct task_struct *k);
 void kthread_parkme(void);
diff --git a/kernel/kthread.c b/kernel/kthread.c
index a2fbbb7..31696c8 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -17,6 +17,7 @@
 #include <linux/slab.h>
 #include <linux/freezer.h>
 #include <linux/ptrace.h>
+#include <linux/uaccess.h>
 #include <trace/events/sched.h>
 
 static DEFINE_SPINLOCK(kthread_create_lock);
@@ -122,6 +123,24 @@ void *kthread_data(struct task_struct *task)
 	return to_kthread(task)->data;
 }
 
+/**
+ * probe_kthread_data - speculatively version of kthread_data()
+ * @task: possible kthread task in question
+ *
+ * @task could be a kthread task.  Return the data value specified when it
+ * was created if accessible.  If @task isn't a kthread task or its data is
+ * inaccessible for any reason, %NULL is returned.  This function requires
+ * that @task itself is safe to dereference.
+ */
+void *probe_kthread_data(struct task_struct *task)
+{
+	struct kthread *kthread = to_kthread(task);
+	void *data = NULL;
+
+	probe_kernel_read(&data, &kthread->data, sizeof(data));
+	return data;
+}
+
 static void __kthread_parkme(struct kthread *self)
 {
 	__set_current_state(TASK_INTERRUPTIBLE);
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/3] workqueue: include workqueue info when printing debug dump of a worker task
  2013-03-30  3:00 [PATCHSET] workqueue, writeback: better worker information in task dumps Tejun Heo
  2013-03-30  3:00 ` [PATCH 1/3] kthread: implement probe_kthread_data() Tejun Heo
@ 2013-03-30  3:00 ` Tejun Heo
  2013-03-30  3:00 ` [PATCH 3/3] writeback: set worker desc to identify writeback workers in task dumps Tejun Heo
  2 siblings, 0 replies; 10+ messages in thread
From: Tejun Heo @ 2013-03-30  3:00 UTC (permalink / raw)
  To: axboe, akpm
  Cc: jack, david, linux-kernel, Tejun Heo, Peter Zijlstra, Ingo Molnar

One of the problems that arise when converting dedicated custom
threadpool to workqueue is that the shared worker pool used by
workqueue anonimizes each worker making it more difficult to identify
what the worker was doing on which target from the output of sysrq-t
or debug dump from oops, BUG() and friends.

This patch implements set_worker_desc() which can be called from any
workqueue work function to set its description.  When the worker task
is dumped for whatever reason - sysrq-t, WARN, BUG, oops, lockdep
assertion and so on - the description will be printed out together
with the workqueue name and the worker function pointer.

The printing side is implemented by print_worker_info() which is
called from functions in task dump paths - sched_show_task() and
dump_stack_print_info().  print_worker_info() can be safely called on
any task in any state as long as the task struct itself is accessible.
It uses probe_*() functions to access worker fields.  It may print
garbage if something went very wrong, but it wouldn't cause (another)
oops.

The description is currently limited to 24bytes including the
terminating \0.  worker->desc_valid and workder->desc[] are added and
the 64 bytes marker which was already incorrect before adding the new
fields is moved to the correct position.

Here's an example dump with writeback updated to set the bdi name as
worker desc.

 Hardware name: Bochs
 Modules linked in:
 Pid: 7, comm: kworker/u9:0 Not tainted 3.9.0-rc1-work+ #1
 Workqueue: writeback bdi_writeback_workfn (flush-8:0)
  ffffffff820a3ab0 ffff88000f6e9cb8 ffffffff81c61845 ffff88000f6e9cf8
  ffffffff8108f50f 0000000000000000 0000000000000000 ffff88000cde16b0
  ffff88000cde1aa8 ffff88001ee19240 ffff88000f6e9fd8 ffff88000f6e9d08
 Call Trace:
  [<ffffffff81c61845>] dump_stack+0x19/0x1b
  [<ffffffff8108f50f>] warn_slowpath_common+0x7f/0xc0
  [<ffffffff8108f56a>] warn_slowpath_null+0x1a/0x20
  [<ffffffff81200150>] bdi_writeback_workfn+0x2a0/0x3b0
 ...

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Dave Chinner <david@fromorbit.com>
---
 include/linux/workqueue.h   |  5 +++
 kernel/sched/core.c         |  1 +
 kernel/workqueue.c          | 79 +++++++++++++++++++++++++++++++++++++++++++++
 kernel/workqueue_internal.h | 12 ++++++-
 lib/dump_stack.c            |  2 ++
 5 files changed, 98 insertions(+), 1 deletion(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 7179756..623488f 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -92,6 +92,9 @@ enum {
 	/* bit mask for work_busy() return values */
 	WORK_BUSY_PENDING	= 1 << 0,
 	WORK_BUSY_RUNNING	= 1 << 1,
+
+	/* maximum string length for set_worker_desc() */
+	WORKER_DESC_LEN		= 24,
 };
 
 struct work_struct {
@@ -447,6 +450,8 @@ extern void workqueue_set_max_active(struct workqueue_struct *wq,
 extern bool current_is_workqueue_rescuer(void);
 extern bool workqueue_congested(int cpu, struct workqueue_struct *wq);
 extern unsigned int work_busy(struct work_struct *work);
+extern __printf(1, 2) void set_worker_desc(const char *fmt, ...);
+extern void print_worker_info(const char *log_lvl, struct task_struct *task);
 
 /**
  * queue_work - queue work on a workqueue
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 23606ee..aba4ccd 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4630,6 +4630,7 @@ void sched_show_task(struct task_struct *p)
 		task_pid_nr(p), ppid,
 		(unsigned long)task_thread_info(p)->flags);
 
+	print_worker_info(KERN_INFO, p);
 	show_stack(p, NULL);
 }
 
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 80432c7..3181e9b 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -46,6 +46,7 @@
 #include <linux/rculist.h>
 #include <linux/nodemask.h>
 #include <linux/moduleparam.h>
+#include <linux/uaccess.h>
 
 #include "workqueue_internal.h"
 
@@ -2197,6 +2198,7 @@ __acquires(&pool->lock)
 	worker->current_work = NULL;
 	worker->current_func = NULL;
 	worker->current_pwq = NULL;
+	worker->desc_valid = false;
 	pwq_dec_nr_in_flight(pwq, work_color);
 }
 
@@ -4364,6 +4366,83 @@ unsigned int work_busy(struct work_struct *work)
 }
 EXPORT_SYMBOL_GPL(work_busy);
 
+/**
+ * set_worker_desc - set description for the current work item
+ * @fmt: printf-style format string
+ * @...: arguments for the format string
+ *
+ * This function can be called by a running work function to describe what
+ * the work item is about.  If the worker task gets dumped, this
+ * information will be printed out together to help debugging.  The
+ * description can be at most WORKER_DESC_LEN including the trailing '\0'.
+ */
+void set_worker_desc(const char *fmt, ...)
+{
+	struct worker *worker = current_wq_worker();
+	va_list args;
+
+	if (worker) {
+		va_start(args, fmt);
+		vsnprintf(worker->desc, sizeof(worker->desc), fmt, args);
+		va_end(args);
+		worker->desc_valid = true;
+	}
+}
+
+/**
+ * print_worker_info - print out worker information and description
+ * @log_lvl: the log level to use when printing
+ * @task: target task
+ *
+ * If @task is a worker and currently executing a work item, print out the
+ * name of the workqueue being serviced and worker description set with
+ * set_worker_desc() by the currently executing work item.
+ *
+ * This function can be safely called on any task as long as the
+ * task_struct itself is accessible.  While safe, this function isn't
+ * synchronized and may print out mixups or garbages of limited length.
+ */
+void print_worker_info(const char *log_lvl, struct task_struct *task)
+{
+	work_func_t *fn = NULL;
+	char name[WQ_NAME_LEN] = { };
+	char desc[WORKER_DESC_LEN] = { };
+	struct pool_workqueue *pwq = NULL;
+	struct workqueue_struct *wq = NULL;
+	bool desc_valid = false;
+	struct worker *worker;
+
+	if (!(task->flags & PF_WQ_WORKER))
+		return;
+
+	/*
+	 * This function is called without any synchronization and @task
+	 * could be in any state.  Be careful with dereferences.
+	 */
+	worker = probe_kthread_data(task);
+
+	/*
+	 * Carefully copy the associated workqueue's workfn and name.  Keep
+	 * the original last '\0' in case the original contains garbage.
+	 */
+	probe_kernel_read(&fn, &worker->current_func, sizeof(fn));
+	probe_kernel_read(&pwq, &worker->current_pwq, sizeof(pwq));
+	probe_kernel_read(&wq, &pwq->wq, sizeof(wq));
+	probe_kernel_read(name, wq->name, sizeof(name) - 1);
+
+	/* copy worker description */
+	probe_kernel_read(&desc_valid, &worker->desc_valid, sizeof(desc_valid));
+	if (desc_valid)
+		probe_kernel_read(desc, worker->desc, sizeof(desc) - 1);
+
+	if (fn || name[0] || desc[0]) {
+		printk("%sWorkqueue: %s %pf", log_lvl, name, fn);
+		if (desc[0])
+			pr_cont(" (%s)", desc);
+		pr_cont("\n");
+	}
+}
+
 /*
  * CPU hotplug.
  *
diff --git a/kernel/workqueue_internal.h b/kernel/workqueue_internal.h
index 84ab6e1..ad83c96 100644
--- a/kernel/workqueue_internal.h
+++ b/kernel/workqueue_internal.h
@@ -29,15 +29,25 @@ struct worker {
 	struct work_struct	*current_work;	/* L: work being processed */
 	work_func_t		current_func;	/* L: current_work's fn */
 	struct pool_workqueue	*current_pwq; /* L: current_work's pwq */
+	bool			desc_valid;	/* ->desc is valid */
 	struct list_head	scheduled;	/* L: scheduled works */
+
+	/* 64 bytes boundary on 64bit, 32 on 32bit */
+
 	struct task_struct	*task;		/* I: worker task */
 	struct worker_pool	*pool;		/* I: the associated pool */
 						/* L: for rescuers */
-	/* 64 bytes boundary on 64bit, 32 on 32bit */
+
 	unsigned long		last_active;	/* L: last active timestamp */
 	unsigned int		flags;		/* X: flags */
 	int			id;		/* I: worker id */
 
+	/*
+	 * Opaque string set with work_set_desc().  Printed out with task
+	 * dump for debugging - WARN, BUG, panic or sysrq.
+	 */
+	char			desc[WORKER_DESC_LEN];
+
 	/* used only by rescuers to point to the target workqueue */
 	struct workqueue_struct	*rescue_wq;	/* I: the workqueue to rescue */
 };
diff --git a/lib/dump_stack.c b/lib/dump_stack.c
index 520a878..776af7b 100644
--- a/lib/dump_stack.c
+++ b/lib/dump_stack.c
@@ -44,6 +44,8 @@ void dump_stack_print_info(const char *log_lvl)
 	       init_utsname()->release,
 	       (int)strcspn(init_utsname()->version, " "),
 	       init_utsname()->version, dump_stack_arch_desc_str);
+
+	print_worker_info(log_lvl, current);
 }
 
 /**
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/3] writeback: set worker desc to identify writeback workers in task dumps
  2013-03-30  3:00 [PATCHSET] workqueue, writeback: better worker information in task dumps Tejun Heo
  2013-03-30  3:00 ` [PATCH 1/3] kthread: implement probe_kthread_data() Tejun Heo
  2013-03-30  3:00 ` [PATCH 2/3] workqueue: include workqueue info when printing debug dump of a worker task Tejun Heo
@ 2013-03-30  3:00 ` Tejun Heo
  2 siblings, 0 replies; 10+ messages in thread
From: Tejun Heo @ 2013-03-30  3:00 UTC (permalink / raw)
  To: axboe, akpm; +Cc: jack, david, linux-kernel, Tejun Heo

Writeback has been recently converted to use workqueue instead of its
private thread pool implementation.  One negative side effect of this
conversion is that there's no easy to tell which backing device a
writeback work item was working on at the time of task dump, be it
sysrq-t, BUG, WARN or whatever, which, according to our writeback
brethren, is important in tracking down issues with a lot of mounted
file systems on a lot of different devices.

This patch restores that information using the new worker description
facility.  bdi_writeback_workfn() calls set_work_desc() to identify
which bdi it's working on.  The description is printed out together
with the worqueue name and worker function as in the following example
dump.

 WARNING: at /work/os/work/fs/fs-writeback.c:1015 bdi_writeback_workfn+0x2b4/0x3c0()
 Modules linked in:
 Pid: 28, comm: kworker/u18:0 Not tainted 3.9.0-rc1-work+ #24 empty empty/S3992
 Workqueue: writeback bdi_writeback_workfn (flush-8:16)
  ffffffff820a3a98 ffff88015b927cb8 ffffffff81c61855 ffff88015b927cf8
  ffffffff8108f500 0000000000000000 ffff88007a171948 ffff88007a1716b0
  ffff88015b49df00 ffff88015b8d3940 0000000000000000 ffff88015b927d08
 Call Trace:
  [<ffffffff81c61855>] dump_stack+0x19/0x1b
  [<ffffffff8108f500>] warn_slowpath_common+0x70/0xa0
  [<ffffffff8108f54a>] warn_slowpath_null+0x1a/0x20
  [<ffffffff81200144>] bdi_writeback_workfn+0x2b4/0x3c0
  [<ffffffff810b4c87>] process_one_work+0x1d7/0x660
  [<ffffffff810b5c72>] worker_thread+0x122/0x380
  [<ffffffff810bdfea>] kthread+0xea/0xf0
  [<ffffffff81c6cedc>] ret_from_fork+0x7c/0xb0

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
---
 fs/fs-writeback.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 8067d37..3be5718 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1009,6 +1009,7 @@ void bdi_writeback_workfn(struct work_struct *work)
 	struct backing_dev_info *bdi = wb->bdi;
 	long pages_written;
 
+	set_worker_desc("flush-%s", dev_name(bdi->dev));
 	current->flags |= PF_SWAPWRITE;
 
 	if (likely(!current_is_workqueue_rescuer() ||
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/3] kthread: implement probe_kthread_data()
  2013-03-30  3:00 ` [PATCH 1/3] kthread: implement probe_kthread_data() Tejun Heo
@ 2013-03-30 14:36   ` Oleg Nesterov
  2013-03-30 16:17     ` Tejun Heo
  2013-04-03  9:26   ` Jan Kara
  1 sibling, 1 reply; 10+ messages in thread
From: Oleg Nesterov @ 2013-03-30 14:36 UTC (permalink / raw)
  To: Tejun Heo; +Cc: axboe, akpm, jack, david, linux-kernel

On 03/29, Tejun Heo wrote:
>
> Implement probe_kthread_data() which returns kthread_data if
> accessible.  The function is equivalent to kthread_data() except that
> the specified @task may not be a kthread
                 ^^^^^^^^^^^^^^^^^^^^^^^^^

This doesn't necessarily mean its ->vfork_done != NULL... but I guess
the "false positive" is fine for your purpose.

> or its vfork_done is already
> cleared rendering struct kthread inaccessible.
> ...

> +/**
> + * probe_kthread_data - speculatively version of kthread_data()
> + * @task: possible kthread task in question
> + *
> + * @task could be a kthread task.  Return the data value specified when it
> + * was created if accessible.  If @task isn't a kthread task or its data is
> + * inaccessible for any reason, %NULL is returned.  This function requires
> + * that @task itself is safe to dereference.
> + */
> +void *probe_kthread_data(struct task_struct *task)
> +{
> +	struct kthread *kthread = to_kthread(task);
> +	void *data = NULL;
> +
> +	probe_kernel_read(&data, &kthread->data, sizeof(data));
> +	return data;
> +}

OK, but we can simply check ->vfork_done != NULL ?

We do not care if we race with the exiting thread which can clear
its ->vfork_done. If it is safe to dereference this @task then
kthread is also safe.

kthread-introduce-to_live_kthread.patch in -mm adds the trivial
helper,

	static inline struct kthread *to_kthread(struct task_struct *k)
	{
	       return __to_kthread(k->vfork_done);
	}

	static struct kthread *to_live_kthread(struct task_struct *k)
	{
	       struct completion *vfork = ACCESS_ONCE(k->vfork_done);
	       if (likely(vfork))
		       return __to_kthread(vfork);
	       return NULL;
	}

So, perhaps,

	voif *kthread_data_safe(struct task_struct *task)
	{
		struct kthread *kthread = to_live_kthread(task);
		return kthread ? kthread->data : NULL;
	}

?

Oleg.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/3] kthread: implement probe_kthread_data()
  2013-03-30 14:36   ` Oleg Nesterov
@ 2013-03-30 16:17     ` Tejun Heo
  2013-03-30 17:00       ` Oleg Nesterov
  0 siblings, 1 reply; 10+ messages in thread
From: Tejun Heo @ 2013-03-30 16:17 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Jens Axboe, Andrew Morton, Jan Kara, david, lkml

Hello, Oleg.

On Sat, Mar 30, 2013 at 7:36 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> > +void *probe_kthread_data(struct task_struct *task)
> > +{
> > +     struct kthread *kthread = to_kthread(task);
> > +     void *data = NULL;
> > +
> > +     probe_kernel_read(&data, &kthread->data, sizeof(data));
> > +     return data;
> > +}
>
> OK, but we can simply check ->vfork_done != NULL ?

Hmm... what if ->vfork_done is pointing to some weird place?

Thanks.

--
tejun

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/3] kthread: implement probe_kthread_data()
  2013-03-30 16:17     ` Tejun Heo
@ 2013-03-30 17:00       ` Oleg Nesterov
       [not found]         ` <CAOS58YNMbTY2fZzUXeyLwuA+nowW2AhgLvQwKf_0jWXDen6HzQ@mail.gmail.com>
  0 siblings, 1 reply; 10+ messages in thread
From: Oleg Nesterov @ 2013-03-30 17:00 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Jens Axboe, Andrew Morton, Jan Kara, david, lkml

Hi Tejun,

On 03/30, Tejun Heo wrote:
>
> On Sat, Mar 30, 2013 at 7:36 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> > > +void *probe_kthread_data(struct task_struct *task)
> > > +{
> > > +     struct kthread *kthread = to_kthread(task);
> > > +     void *data = NULL;
> > > +
> > > +     probe_kernel_read(&data, &kthread->data, sizeof(data));
> > > +     return data;
> > > +}
> >
> > OK, but we can simply check ->vfork_done != NULL ?
>
> Hmm... what if ->vfork_done is pointing to some weird place?

Aah...  "weird place" is not possible if we know that @task is kthread,
it is either NULL or it points into tsk->stack which can only go away
along with task. Note that kthread_stop() already relies on this.

But I guess I missed the fact that this helper should be safe even this
@task can be the a vfork'ed user-space process, yes?

In this case, yes, ->vfork_done can point to task->parent->stack, not
good...

Perhaps,

	voif *kthread_data_safe(struct task_struct *task)
	{
		if (task->parent == kthreadd_task) {
			struct kthread *kthread = to_live_kthread(task);
			if (kthread)
				return kthread->data;
		}

		return NULL;
	}

?

Or we can add to_live_kthread_safe() which checks "parent == kthreadd_task"
instead.

Oleg.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/3] kthread: implement probe_kthread_data()
       [not found]         ` <CAOS58YNMbTY2fZzUXeyLwuA+nowW2AhgLvQwKf_0jWXDen6HzQ@mail.gmail.com>
@ 2013-03-30 19:12           ` Oleg Nesterov
  0 siblings, 0 replies; 10+ messages in thread
From: Oleg Nesterov @ 2013-03-30 19:12 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Jens Axboe, Andrew Morton, Jan Kara, david, lkml

On 03/30, Tejun Heo wrote:
>
> On Sat, Mar 30, 2013 at 10:00 AM, Oleg Nesterov <oleg@redhat.com> wrote:
>
> > But I guess I missed the fact that this helper should be safe even this
> > @task can be the a vfork'ed user-space process, yes?
> >
>
> Yeap, we really don't know what to expect.

OK,

> > Or we can add to_live_kthread_safe() which checks "parent == kthreadd_task"
> > instead.
>
> But that would be more fragile than probe_read. I  mean, this thing will be
> called on BUGs and oopses.

OK, I agree.

Sorry for noise and thanks for your explanations, I really missed
the intent.

Oleg.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/3] kthread: implement probe_kthread_data()
  2013-03-30  3:00 ` [PATCH 1/3] kthread: implement probe_kthread_data() Tejun Heo
  2013-03-30 14:36   ` Oleg Nesterov
@ 2013-04-03  9:26   ` Jan Kara
  1 sibling, 0 replies; 10+ messages in thread
From: Jan Kara @ 2013-04-03  9:26 UTC (permalink / raw)
  To: Tejun Heo; +Cc: axboe, akpm, jack, david, linux-kernel, Oleg Nesterov

On Fri 29-03-13 20:00:45, Tejun Heo wrote:
> Implement probe_kthread_data() which returns kthread_data if
> accessible.  The function is equivalent to kthread_data() except that
> the specified @task may not be a kthread or its vfork_done is already
> cleared rendering struct kthread inaccessible.  In the former case,
> probe_kthread_data() may return any value.  In the latter, NULL.
> 
> This will be used to safely print debug information without affecting
> synchronization in the normal paths.  Workqueue debug info printing on
> dump_stack() and friends will make use of it.
  Thanks for looking into this. Just one typo correction below:

> @@ -122,6 +123,24 @@ void *kthread_data(struct task_struct *task)
>  	return to_kthread(task)->data;
>  }
>  
> +/**
> + * probe_kthread_data - speculatively version of kthread_data()
                                   ^^^^ speculative

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/3] kthread: implement probe_kthread_data()
  2013-04-03 19:24 [PATCHSET v2] workqueue, writeback: better worker information " Tejun Heo
@ 2013-04-03 19:24 ` Tejun Heo
  0 siblings, 0 replies; 10+ messages in thread
From: Tejun Heo @ 2013-04-03 19:24 UTC (permalink / raw)
  To: axboe, akpm; +Cc: jack, david, linux-kernel, Tejun Heo, Oleg Nesterov

Implement probe_kthread_data() which returns kthread_data if
accessible.  The function is equivalent to kthread_data() except that
the specified @task may not be a kthread or its vfork_done is already
cleared rendering struct kthread inaccessible.  In the former case,
probe_kthread_data() may return any value.  In the latter, NULL.

This will be used to safely print debug information without affecting
synchronization in the normal paths.  Workqueue debug info printing on
dump_stack() and friends will make use of it.

v2: Spelling error in comment fixed as suggested by Jan Kara.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Kara <jack@suse.cz>
---
 include/linux/kthread.h |  1 +
 kernel/kthread.c        | 19 +++++++++++++++++++
 2 files changed, 20 insertions(+)

diff --git a/include/linux/kthread.h b/include/linux/kthread.h
index 8d81664..7dcef33 100644
--- a/include/linux/kthread.h
+++ b/include/linux/kthread.h
@@ -43,6 +43,7 @@ bool kthread_should_stop(void);
 bool kthread_should_park(void);
 bool kthread_freezable_should_stop(bool *was_frozen);
 void *kthread_data(struct task_struct *k);
+void *probe_kthread_data(struct task_struct *k);
 int kthread_park(struct task_struct *k);
 void kthread_unpark(struct task_struct *k);
 void kthread_parkme(void);
diff --git a/kernel/kthread.c b/kernel/kthread.c
index b9db231..a279c55 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -17,6 +17,7 @@
 #include <linux/slab.h>
 #include <linux/freezer.h>
 #include <linux/ptrace.h>
+#include <linux/uaccess.h>
 #include <trace/events/sched.h>
 
 static DEFINE_SPINLOCK(kthread_create_lock);
@@ -135,6 +136,24 @@ void *kthread_data(struct task_struct *task)
 	return to_kthread(task)->data;
 }
 
+/**
+ * probe_kthread_data - speculative version of kthread_data()
+ * @task: possible kthread task in question
+ *
+ * @task could be a kthread task.  Return the data value specified when it
+ * was created if accessible.  If @task isn't a kthread task or its data is
+ * inaccessible for any reason, %NULL is returned.  This function requires
+ * that @task itself is safe to dereference.
+ */
+void *probe_kthread_data(struct task_struct *task)
+{
+	struct kthread *kthread = to_kthread(task);
+	void *data = NULL;
+
+	probe_kernel_read(&data, &kthread->data, sizeof(data));
+	return data;
+}
+
 static void __kthread_parkme(struct kthread *self)
 {
 	__set_current_state(TASK_INTERRUPTIBLE);
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2013-04-03 19:24 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-03-30  3:00 [PATCHSET] workqueue, writeback: better worker information in task dumps Tejun Heo
2013-03-30  3:00 ` [PATCH 1/3] kthread: implement probe_kthread_data() Tejun Heo
2013-03-30 14:36   ` Oleg Nesterov
2013-03-30 16:17     ` Tejun Heo
2013-03-30 17:00       ` Oleg Nesterov
     [not found]         ` <CAOS58YNMbTY2fZzUXeyLwuA+nowW2AhgLvQwKf_0jWXDen6HzQ@mail.gmail.com>
2013-03-30 19:12           ` Oleg Nesterov
2013-04-03  9:26   ` Jan Kara
2013-03-30  3:00 ` [PATCH 2/3] workqueue: include workqueue info when printing debug dump of a worker task Tejun Heo
2013-03-30  3:00 ` [PATCH 3/3] writeback: set worker desc to identify writeback workers in task dumps Tejun Heo
  -- strict thread matches above, loose matches on Subject: below --
2013-04-03 19:24 [PATCHSET v2] workqueue, writeback: better worker information " Tejun Heo
2013-04-03 19:24 ` [PATCH 1/3] kthread: implement probe_kthread_data() Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox