[ 02/13] workqueue: consider work function when searching for busy work items

stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Tejun Heo <tj@kernel.org>,
	Andrey Isakov <andy51@gmx.ru>, Li Zefan <lizefan@huawei.com>
Subject: [ 02/13] workqueue: consider work function when searching for busy work items
Date: Mon, 26 Aug 2013 18:08:34 -0700	[thread overview]
Message-ID: <20130827010310.032641422@linuxfoundation.org> (raw)
In-Reply-To: <20130827010309.850395966@linuxfoundation.org>

3.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Tejun Heo <tj@kernel.org>

commit a2c1c57be8d9fd5b716113c8991d3d702eeacf77 upstream.

To avoid executing the same work item concurrenlty, workqueue hashes
currently busy workers according to their current work items and looks
up the the table when it wants to execute a new work item.  If there
already is a worker which is executing the new work item, the new item
is queued to the found worker so that it gets executed only after the
current execution finishes.

Unfortunately, a work item may be freed while being executed and thus
recycled for different purposes.  If it gets recycled for a different
work item and queued while the previous execution is still in
progress, workqueue may make the new work item wait for the old one
although the two aren't really related in any way.

In extreme cases, this false dependency may lead to deadlock although
it's extremely unlikely given that there aren't too many self-freeing
work item users and they usually don't wait for other work items.

To alleviate the problem, record the current work function in each
busy worker and match it together with the work item address in
find_worker_executing_work().  While this isn't complete, it ensures
that unrelated work items don't interact with each other and in the
very unlikely case where a twisted wq user triggers it, it's always
onto itself making the culprit easy to spot.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Andrey Isakov <andy51@gmx.ru>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=51701
[lizf: Backported to 3.4:
 - Adjust context
 - Incorporate earlier logging cleanup in process_one_work() from
   044c782ce3a9 ('workqueue: fix checkpatch issues')]
Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 kernel/workqueue.c |   42 +++++++++++++++++++++++++++++++-----------
 1 file changed, 31 insertions(+), 11 deletions(-)

--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -128,6 +128,7 @@ struct worker {
 	};
 
 	struct work_struct	*current_work;	/* L: work being processed */
+	work_func_t		current_func;	/* L: current_work's fn */
 	struct cpu_workqueue_struct *current_cwq; /* L: current_work's cwq */
 	struct list_head	scheduled;	/* L: scheduled works */
 	struct task_struct	*task;		/* I: worker task */
@@ -838,7 +839,8 @@ static struct worker *__find_worker_exec
 	struct hlist_node *tmp;
 
 	hlist_for_each_entry(worker, tmp, bwh, hentry)
-		if (worker->current_work == work)
+		if (worker->current_work == work &&
+		    worker->current_func == work->func)
 			return worker;
 	return NULL;
 }
@@ -848,9 +850,27 @@ static struct worker *__find_worker_exec
  * @gcwq: gcwq of interest
  * @work: work to find worker for
  *
- * Find a worker which is executing @work on @gcwq.  This function is
- * identical to __find_worker_executing_work() except that this
- * function calculates @bwh itself.
+ * Find a worker which is executing @work on @gcwq by searching
+ * @gcwq->busy_hash which is keyed by the address of @work.  For a worker
+ * to match, its current execution should match the address of @work and
+ * its work function.  This is to avoid unwanted dependency between
+ * unrelated work executions through a work item being recycled while still
+ * being executed.
+ *
+ * This is a bit tricky.  A work item may be freed once its execution
+ * starts and nothing prevents the freed area from being recycled for
+ * another work item.  If the same work item address ends up being reused
+ * before the original execution finishes, workqueue will identify the
+ * recycled work item as currently executing and make it wait until the
+ * current execution finishes, introducing an unwanted dependency.
+ *
+ * This function checks the work item address, work function and workqueue
+ * to avoid false positives.  Note that this isn't complete as one may
+ * construct a work function which can introduce dependency onto itself
+ * through a recycled work item.  Well, if somebody wants to shoot oneself
+ * in the foot that badly, there's only so much we can do, and if such
+ * deadlock actually occurs, it should be easy to locate the culprit work
+ * function.
  *
  * CONTEXT:
  * spin_lock_irq(gcwq->lock).
@@ -1811,7 +1831,6 @@ __acquires(&gcwq->lock)
 	struct global_cwq *gcwq = cwq->gcwq;
 	struct hlist_head *bwh = busy_worker_head(gcwq, work);
 	bool cpu_intensive = cwq->wq->flags & WQ_CPU_INTENSIVE;
-	work_func_t f = work->func;
 	int work_color;
 	struct worker *collision;
 #ifdef CONFIG_LOCKDEP
@@ -1840,6 +1859,7 @@ __acquires(&gcwq->lock)
 	debug_work_deactivate(work);
 	hlist_add_head(&worker->hentry, bwh);
 	worker->current_work = work;
+	worker->current_func = work->func;
 	worker->current_cwq = cwq;
 	work_color = get_work_color(work);
 
@@ -1877,7 +1897,7 @@ __acquires(&gcwq->lock)
 	lock_map_acquire_read(&cwq->wq->lockdep_map);
 	lock_map_acquire(&lockdep_map);
 	trace_workqueue_execute_start(work);
-	f(work);
+	worker->current_func(work);
 	/*
 	 * While we must be careful to not use "work" after this, the trace
 	 * point will only record its address.
@@ -1887,11 +1907,10 @@ __acquires(&gcwq->lock)
 	lock_map_release(&cwq->wq->lockdep_map);
 
 	if (unlikely(in_atomic() || lockdep_depth(current) > 0)) {
-		printk(KERN_ERR "BUG: workqueue leaked lock or atomic: "
-		       "%s/0x%08x/%d\n",
-		       current->comm, preempt_count(), task_pid_nr(current));
-		printk(KERN_ERR "    last function: ");
-		print_symbol("%s\n", (unsigned long)f);
+		pr_err("BUG: workqueue leaked lock or atomic: %s/0x%08x/%d\n"
+		       "     last function: %pf\n",
+		       current->comm, preempt_count(), task_pid_nr(current),
+		       worker->current_func);
 		debug_show_held_locks(current);
 		dump_stack();
 	}
@@ -1905,6 +1924,7 @@ __acquires(&gcwq->lock)
 	/* we're done with it, release */
 	hlist_del_init(&worker->hentry);
 	worker->current_work = NULL;
+	worker->current_func = NULL;
 	worker->current_cwq = NULL;
 	cwq_dec_nr_in_flight(cwq, work_color, false);
 }

next prev parent reply	other threads:[~2013-08-27  1:08 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-08-27  1:08 [ 00/13] 3.4.60-stable review Greg Kroah-Hartman
2013-08-27  1:08 ` [ 01/13] workqueue: fix possible stall on try_to_grab_pending() of a delayed work item Greg Kroah-Hartman
2013-08-27  1:08 ` Greg Kroah-Hartman [this message]
2013-08-27  1:08 ` [ 03/13] zd1201: do not use stack as URB transfer_buffer Greg Kroah-Hartman
2013-08-27  1:08 ` [ 04/13] xen/events: initialize local per-cpu mask for all possible events Greg Kroah-Hartman
2013-08-27  1:08 ` [ 05/13] drm/i915: Invalidate TLBs for the rings after a reset Greg Kroah-Hartman
2013-08-27  1:08 ` [ 06/13] of: fdt: fix memory initialization for expanded DT Greg Kroah-Hartman
2013-08-27  1:08 ` [ 07/13] nilfs2: remove double bio_put() in nilfs_end_bio_write() for BIO_EOPNOTSUPP error Greg Kroah-Hartman
2013-08-27  1:08 ` [ 08/13] nilfs2: fix issue with counting number of bio requests for BIO_EOPNOTSUPP error detection Greg Kroah-Hartman
2013-08-27  1:08 ` [ 09/13] Hostap: copying wrong data prism2_ioctl_giwaplist() Greg Kroah-Hartman
2013-08-27  1:08 ` [ 10/13] libata: apply behavioral quirks to sil3826 PMP Greg Kroah-Hartman
2013-08-27  1:08 ` [ 11/13] SCSI: zfcp: fix lock imbalance by reworking request queue locking Greg Kroah-Hartman
2013-08-27  1:08 ` [ 12/13] SCSI: zfcp: fix schedule-inside-lock in scsi_device list loops Greg Kroah-Hartman
2013-08-27  1:08 ` [ 13/13] x86/xen: do not identity map UNUSABLE regions in the machine E820 Greg Kroah-Hartman
2013-08-27  4:22 ` [ 00/13] 3.4.60-stable review Guenter Roeck
2013-08-27 22:31   ` Greg Kroah-Hartman
2013-08-27 20:42 ` Shuah Khan
2013-08-27 22:28   ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130827010310.032641422@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=andy51@gmx.ru \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lizefan@huawei.com \
    --cc=stable@vger.kernel.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).