From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Tejun Heo <tj@kernel.org>,
	Andrey Isakov <andy51@gmx.ru>
Subject: [ 60/77] workqueue: consider work function when searching for busy work items
Date: Fri,  1 Mar 2013 11:44:45 -0800
Message-ID: <20130301194358.287088221@linuxfoundation.org>
In-Reply-To: <20130301194351.913471337@linuxfoundation.org>

3.8-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Tejun Heo <tj@kernel.org>

commit a2c1c57be8d9fd5b716113c8991d3d702eeacf77 upstream.

To avoid executing the same work item concurrently, workqueue hashes
currently busy workers according to their current work items and looks
up the table when it wants to execute a new work item.  If there
already is a worker which is executing the new work item, the new item
is queued to the found worker so that it gets executed only after the
current execution finishes.

Unfortunately, a work item may be freed while being executed and thus
recycled for different purposes.  If it gets recycled for a different
work item and queued while the previous execution is still in
progress, workqueue may make the new work item wait for the old one
although the two aren't really related in any way.

In extreme cases, this false dependency may lead to deadlock although
it's extremely unlikely given that there aren't too many self-freeing
work item users and they usually don't wait for other work items.

To alleviate the problem, record the current work function in each
busy worker and match it together with the work item address in
find_worker_executing_work().  While this isn't complete, it ensures
that unrelated work items don't interact with each other and in the
very unlikely case where a twisted wq user triggers it, it's always
onto itself making the culprit easy to spot.
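
The matching rule described above can be illustrated outside the kernel
with a minimal userspace sketch.  The types below are simplified
stand-ins, not the real kernel definitions, and the names find_by_addr(),
find_by_addr_and_func(), fn_a(), fn_b() and the two *_demo() helpers are
hypothetical, chosen only for this illustration.  A flat array stands in
for the busy hash.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; not the real definitions. */
struct work_struct;
typedef void (*work_func_t)(struct work_struct *work);

struct work_struct {
	work_func_t func;
};

struct worker {
	struct work_struct *current_work;	/* work being processed */
	work_func_t current_func;		/* current_work's fn */
};

/* Old rule: a busy worker matches on the work item address alone. */
static struct worker *find_by_addr(struct worker *workers, size_t n,
				   struct work_struct *work)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (workers[i].current_work == work)
			return &workers[i];
	return NULL;
}

/* New rule: the address and the recorded function must both match. */
static struct worker *find_by_addr_and_func(struct worker *workers, size_t n,
					    struct work_struct *work)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (workers[i].current_work == work &&
		    workers[i].current_func == work->func)
			return &workers[i];
	return NULL;
}

/* Two distinct work functions (hypothetical). */
static int calls_a, calls_b;
static void fn_a(struct work_struct *work) { (void)work; calls_a++; }
static void fn_b(struct work_struct *work) { (void)work; calls_b++; }

/*
 * Returns 1 if the old rule reports a false collision for a recycled
 * item while the new rule does not.
 */
int recycled_item_demo(void)
{
	struct work_struct item = { .func = fn_a };
	struct worker busy = { .current_work = &item,
			       .current_func = item.func };

	/* Before recycling, the new rule finds the busy worker. */
	assert(find_by_addr_and_func(&busy, 1, &item) == &busy);

	/* The memory is reused for an unrelated item that runs fn_b. */
	item.func = fn_b;

	return find_by_addr(&busy, 1, &item) == &busy &&
	       find_by_addr_and_func(&busy, 1, &item) == NULL;
}

/*
 * Returns 1 if an item recycled for the same function still matches
 * under the new rule (the remaining "onto itself" case).
 */
int same_func_demo(void)
{
	struct work_struct item = { .func = fn_a };
	struct worker busy = { .current_work = &item,
			       .current_func = item.func };

	item.func = fn_a;	/* recycled, but for the same function */

	return find_by_addr_and_func(&busy, 1, &item) == &busy;
}
```

Under these assumptions both helpers return 1: an unrelated work
function at a recycled address no longer collides, while reuse by the
same function is still treated as the same work item.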

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Andrey Isakov <andy51@gmx.ru>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=51701
Cc: stable@vger.kernel.org

---
 kernel/workqueue.c |   36 +++++++++++++++++++++++++++++-------
 1 file changed, 29 insertions(+), 7 deletions(-)

--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -138,6 +138,7 @@ struct worker {
 	};
 
 	struct work_struct	*current_work;	/* L: work being processed */
+	work_func_t		current_func;	/* L: current_work's fn */
 	struct cpu_workqueue_struct *current_cwq; /* L: current_work's cwq */
 	struct list_head	scheduled;	/* L: scheduled works */
 	struct task_struct	*task;		/* I: worker task */
@@ -910,7 +911,8 @@ static struct worker *__find_worker_exec
 	struct hlist_node *tmp;
 
 	hlist_for_each_entry(worker, tmp, bwh, hentry)
-		if (worker->current_work == work)
+		if (worker->current_work == work &&
+		    worker->current_func == work->func)
 			return worker;
 	return NULL;
 }
@@ -920,9 +922,27 @@ static struct worker *__find_worker_exec
  * @gcwq: gcwq of interest
  * @work: work to find worker for
  *
- * Find a worker which is executing @work on @gcwq.  This function is
- * identical to __find_worker_executing_work() except that this
- * function calculates @bwh itself.
+ * Find a worker which is executing @work on @gcwq by searching
+ * @gcwq->busy_hash which is keyed by the address of @work.  For a worker
+ * to match, its current execution should match the address of @work and
+ * its work function.  This is to avoid unwanted dependency between
+ * unrelated work executions through a work item being recycled while still
+ * being executed.
+ *
+ * This is a bit tricky.  A work item may be freed once its execution
+ * starts and nothing prevents the freed area from being recycled for
+ * another work item.  If the same work item address ends up being reused
+ * before the original execution finishes, workqueue will identify the
+ * recycled work item as currently executing and make it wait until the
+ * current execution finishes, introducing an unwanted dependency.
+ *
+ * This function checks the work item address, work function and workqueue
+ * to avoid false positives.  Note that this isn't complete as one may
+ * construct a work function which can introduce dependency onto itself
+ * through a recycled work item.  Well, if somebody wants to shoot oneself
+ * in the foot that badly, there's only so much we can do, and if such
+ * deadlock actually occurs, it should be easy to locate the culprit work
+ * function.
  *
  * CONTEXT:
  * spin_lock_irq(gcwq->lock).
@@ -2168,7 +2188,6 @@ __acquires(&gcwq->lock)
 	struct global_cwq *gcwq = pool->gcwq;
 	struct hlist_head *bwh = busy_worker_head(gcwq, work);
 	bool cpu_intensive = cwq->wq->flags & WQ_CPU_INTENSIVE;
-	work_func_t f = work->func;
 	int work_color;
 	struct worker *collision;
 #ifdef CONFIG_LOCKDEP
@@ -2208,6 +2227,7 @@ __acquires(&gcwq->lock)
 	debug_work_deactivate(work);
 	hlist_add_head(&worker->hentry, bwh);
 	worker->current_work = work;
+	worker->current_func = work->func;
 	worker->current_cwq = cwq;
 	work_color = get_work_color(work);
 
@@ -2240,7 +2260,7 @@ __acquires(&gcwq->lock)
 	lock_map_acquire_read(&cwq->wq->lockdep_map);
 	lock_map_acquire(&lockdep_map);
 	trace_workqueue_execute_start(work);
-	f(work);
+	worker->current_func(work);
 	/*
 	 * While we must be careful to not use "work" after this, the trace
 	 * point will only record its address.
@@ -2252,7 +2272,8 @@ __acquires(&gcwq->lock)
 	if (unlikely(in_atomic() || lockdep_depth(current) > 0)) {
 		pr_err("BUG: workqueue leaked lock or atomic: %s/0x%08x/%d\n"
 		       "     last function: %pf\n",
-		       current->comm, preempt_count(), task_pid_nr(current), f);
+		       current->comm, preempt_count(), task_pid_nr(current),
+		       worker->current_func);
 		debug_show_held_locks(current);
 		dump_stack();
 	}
@@ -2266,6 +2287,7 @@ __acquires(&gcwq->lock)
 	/* we're done with it, release */
 	hlist_del_init(&worker->hentry);
 	worker->current_work = NULL;
+	worker->current_func = NULL;
 	worker->current_cwq = NULL;
 	cwq_dec_nr_in_flight(cwq, work_color);
 }


