All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Mikulas Patocka <mpatocka@redhat.com>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	Martin Peschke <mpeschke@linux.vnet.ibm.com>,
	Steffen Maier <maier@linux.vnet.ibm.com>,
	James Bottomley <JBottomley@Parallels.com>
Subject: [ 63/74] SCSI: zfcp: fix lock imbalance by reworking request queue locking
Date: Mon, 26 Aug 2013 18:09:01 -0700	[thread overview]
Message-ID: <20130827010437.390785185@linuxfoundation.org> (raw)
In-Reply-To: <20130827010424.535365435@linuxfoundation.org>

3.10-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Martin Peschke <mpeschke@linux.vnet.ibm.com>

commit d79ff142624e1be080ad8d09101f7004d79c36e1 upstream.

This patch adds wait_event_interruptible_lock_irq_timeout(), which is a
straight-forward descendant of wait_event_interruptible_timeout() and
wait_event_interruptible_lock_irq().

The zfcp driver used to call wait_event_interruptible_timeout()
in combination with some intricate and error-prone locking. Using
wait_event_interruptible_lock_irq_timeout() as a replacement
nicely cleans up that locking.

This rework removes a situation that resulted in a locking imbalance
in zfcp_qdio_sbal_get():

BUG: workqueue leaked lock or atomic: events/1/0xffffff00/10
    last function: zfcp_fc_wka_port_offline+0x0/0xa0 [zfcp]

It was introduced by commit c2af7545aaff3495d9bf9a7608c52f0af86fb194
"[SCSI] zfcp: Do not wait for SBALs on stopped queue", which had a new
code path related to ZFCP_STATUS_ADAPTER_QDIOUP that took an early exit
without a required lock being held. The problem occured when a
special, non-SCSI I/O request was being submitted in process context,
when the adapter's queues had been torn down. In this case the bug
surfaced when the Fibre Channel port connection for a well-known address
was closed during a concurrent adapter shut-down procedure, which is a
rare constellation.

This patch also fixes these warnings from the sparse tool (make C=1):

drivers/s390/scsi/zfcp_qdio.c:224:12: warning: context imbalance in
 'zfcp_qdio_sbal_check' - wrong count at exit
drivers/s390/scsi/zfcp_qdio.c:244:5: warning: context imbalance in
 'zfcp_qdio_sbal_get' - unexpected unlock

Last but not least, we get rid of that crappy lock-unlock-lock
sequence at the beginning of the critical section.

It is okay to call zfcp_erp_adapter_reopen() with req_q_lock held.

Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/s390/scsi/zfcp_qdio.c |    8 +----
 include/linux/wait.h          |   57 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 59 insertions(+), 6 deletions(-)

--- a/drivers/s390/scsi/zfcp_qdio.c
+++ b/drivers/s390/scsi/zfcp_qdio.c
@@ -224,11 +224,9 @@ int zfcp_qdio_sbals_from_sg(struct zfcp_
 
 static int zfcp_qdio_sbal_check(struct zfcp_qdio *qdio)
 {
-	spin_lock_irq(&qdio->req_q_lock);
 	if (atomic_read(&qdio->req_q_free) ||
 	    !(atomic_read(&qdio->adapter->status) & ZFCP_STATUS_ADAPTER_QDIOUP))
 		return 1;
-	spin_unlock_irq(&qdio->req_q_lock);
 	return 0;
 }
 
@@ -246,9 +244,8 @@ int zfcp_qdio_sbal_get(struct zfcp_qdio
 {
 	long ret;
 
-	spin_unlock_irq(&qdio->req_q_lock);
-	ret = wait_event_interruptible_timeout(qdio->req_q_wq,
-			       zfcp_qdio_sbal_check(qdio), 5 * HZ);
+	ret = wait_event_interruptible_lock_irq_timeout(qdio->req_q_wq,
+		       zfcp_qdio_sbal_check(qdio), qdio->req_q_lock, 5 * HZ);
 
 	if (!(atomic_read(&qdio->adapter->status) & ZFCP_STATUS_ADAPTER_QDIOUP))
 		return -EIO;
@@ -262,7 +259,6 @@ int zfcp_qdio_sbal_get(struct zfcp_qdio
 		zfcp_erp_adapter_reopen(qdio->adapter, 0, "qdsbg_1");
 	}
 
-	spin_lock_irq(&qdio->req_q_lock);
 	return -EIO;
 }
 
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -805,6 +805,63 @@ do {									\
 	__ret;								\
 })
 
+#define __wait_event_interruptible_lock_irq_timeout(wq, condition,	\
+						    lock, ret)		\
+do {									\
+	DEFINE_WAIT(__wait);						\
+									\
+	for (;;) {							\
+		prepare_to_wait(&wq, &__wait, TASK_INTERRUPTIBLE);	\
+		if (condition)						\
+			break;						\
+		if (signal_pending(current)) {				\
+			ret = -ERESTARTSYS;				\
+			break;						\
+		}							\
+		spin_unlock_irq(&lock);					\
+		ret = schedule_timeout(ret);				\
+		spin_lock_irq(&lock);					\
+		if (!ret)						\
+			break;						\
+	}								\
+	finish_wait(&wq, &__wait);					\
+} while (0)
+
+/**
+ * wait_event_interruptible_lock_irq_timeout - sleep until a condition gets true or a timeout elapses.
+ *		The condition is checked under the lock. This is expected
+ *		to be called with the lock taken.
+ * @wq: the waitqueue to wait on
+ * @condition: a C expression for the event to wait for
+ * @lock: a locked spinlock_t, which will be released before schedule()
+ *	  and reacquired afterwards.
+ * @timeout: timeout, in jiffies
+ *
+ * The process is put to sleep (TASK_INTERRUPTIBLE) until the
+ * @condition evaluates to true or signal is received. The @condition is
+ * checked each time the waitqueue @wq is woken up.
+ *
+ * wake_up() has to be called after changing any variable that could
+ * change the result of the wait condition.
+ *
+ * This is supposed to be called while holding the lock. The lock is
+ * dropped before going to sleep and is reacquired afterwards.
+ *
+ * The function returns 0 if the @timeout elapsed, -ERESTARTSYS if it
+ * was interrupted by a signal, and the remaining jiffies otherwise
+ * if the condition evaluated to true before the timeout elapsed.
+ */
+#define wait_event_interruptible_lock_irq_timeout(wq, condition, lock,	\
+						  timeout)		\
+({									\
+	int __ret = timeout;						\
+									\
+	if (!(condition))						\
+		__wait_event_interruptible_lock_irq_timeout(		\
+					wq, condition, lock, __ret);	\
+	__ret;								\
+})
+
 
 /*
  * These are the old interfaces to sleep waiting for an event.



  parent reply	other threads:[~2013-08-27  1:13 UTC|newest]

Thread overview: 79+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-08-27  1:07 [ 00/74] 3.10.10-stable review Greg Kroah-Hartman
2013-08-27  1:07 ` [ 01/74] KVM: s390: move kvm_guest_enter,exit closer to sie Greg Kroah-Hartman
2013-08-27  1:08 ` [ 02/74] mac80211: dont wait for TX status forever Greg Kroah-Hartman
2013-08-27  1:08 ` [ 03/74] ACPI: add _STA evaluation at do_acpi_find_child() Greg Kroah-Hartman
2013-08-27  1:08 ` [ 04/74] ACPI: Try harder to resolve _ADR collisions for bridges Greg Kroah-Hartman
2013-08-27  1:08 ` [ 05/74] ARC: gdbserver breakage in Big-Endian configuration #1 Greg Kroah-Hartman
2013-08-27  1:08 ` [ 06/74] ARC: gdbserver breakage in Big-Endian configuration #2 Greg Kroah-Hartman
2013-08-27  1:08 ` [ 07/74] ARM: at91: at91sam9x5 RTC is not compatible with at91rm9200 one Greg Kroah-Hartman
2013-08-27  1:08 ` [ 08/74] NFC: llcp: Fix non blocking sockets connections Greg Kroah-Hartman
2013-08-27  1:08 ` [ 09/74] iwlwifi: mvm: correctly configure MCAST in AP mode Greg Kroah-Hartman
2013-08-27  1:08 ` [ 10/74] iwlwifi: mvm: fix " Greg Kroah-Hartman
2013-08-27  1:08 ` [ 11/74] iwlwifi: mvm: properly tell the fw that a STA is awake Greg Kroah-Hartman
2013-08-27  1:08 ` [ 12/74] iwlwifi: mvm: dont set the MCAST queue in STAs queue list Greg Kroah-Hartman
2013-08-27  1:08 ` [ 13/74] iwlwifi: mvm: take the seqno from packet if transmit failed Greg Kroah-Hartman
2013-08-27  1:08 ` [ 14/74] iwlwifi: mvm: unregister leds when registration failed Greg Kroah-Hartman
2013-08-27  1:08 ` [ 15/74] iwlwifi: bump required firmware API version for 3160/7260 Greg Kroah-Hartman
2013-08-27  1:08 ` [ 16/74] iwlwifi: mvm: adjust firmware D3 configuration API Greg Kroah-Hartman
2013-08-27  1:08 ` [ 17/74] tracing: Do not call kmem_cache_free() on allocation failure Greg Kroah-Hartman
2013-08-27  1:08 ` [ 18/74] tracing/kprobe: Wait for disabling all running kprobe handlers Greg Kroah-Hartman
2013-08-27  1:08 ` [ 19/74] tracing: Introduce trace_create_cpu_file() and tracing_get_cpu() Greg Kroah-Hartman
2013-08-27  1:08 ` [ 20/74] tracing: Change tracing_pipe_fops() to rely on tracing_get_cpu() Greg Kroah-Hartman
2013-08-27  1:08 ` [ 21/74] tracing: Change tracing_buffers_fops " Greg Kroah-Hartman
2013-08-27  1:08 ` [ 22/74] tracing: Change tracing_stats_fops " Greg Kroah-Hartman
2013-08-27  1:08 ` [ 23/74] tracing: Change tracing_entries_fops " Greg Kroah-Hartman
2013-08-27  1:08 ` [ 24/74] tracing: Change tracing_fops/snapshot_fops " Greg Kroah-Hartman
2013-08-27  1:08 ` [ 25/74] ftrace: Add check for NULL regs if ops has SAVE_REGS set Greg Kroah-Hartman
2013-08-27  1:08 ` [ 26/74] tracing: Turn event/id->i_private into call->event.type Greg Kroah-Hartman
2013-08-27  1:08 ` [ 27/74] tracing: Change event_enable/disable_read() to verify i_private != NULL Greg Kroah-Hartman
2013-08-27  1:08 ` [ 28/74] tracing: Change event_filter_read/write " Greg Kroah-Hartman
2013-08-27  1:08 ` [ 29/74] tracing: Change f_start() to take event_mutex and " Greg Kroah-Hartman
2013-08-27  1:08 ` [ 30/74] tracing: Introduce remove_event_file_dir() Greg Kroah-Hartman
2013-08-27  1:08 ` [ 31/74] tracing: Change remove_event_file_dir() to clear "d_subdirs"->i_private Greg Kroah-Hartman
2013-08-27  1:08 ` [ 32/74] tracing: trace_remove_event_call() should fail if call/file is in use Greg Kroah-Hartman
2013-08-27  1:08 ` [ 33/74] tracing/kprobes: Fail to unregister if probe event files are " Greg Kroah-Hartman
2013-08-27  1:08 ` [ 34/74] tracing/uprobes: " Greg Kroah-Hartman
2013-08-27  1:08 ` [ 35/74] ftrace: Check module functions being traced on reload Greg Kroah-Hartman
2013-08-27  1:08 ` [ 36/74] xen/smp: initialize IPI vectors before marking CPU online Greg Kroah-Hartman
2013-08-27  1:08 ` [ 37/74] ARC: [lib] strchr breakage in Big-endian configuration Greg Kroah-Hartman
2013-08-27  1:08 ` [ 38/74] zd1201: do not use stack as URB transfer_buffer Greg Kroah-Hartman
2013-08-27  1:08 ` [ 39/74] VFS: collect_mounts() should return an ERR_PTR Greg Kroah-Hartman
2013-08-27  1:08 ` [ 40/74] x86: Dont clear olpc_ofw_header when sentinel is detected Greg Kroah-Hartman
2013-08-27  1:08 ` [ 41/74] xen/events: initialize local per-cpu mask for all possible events Greg Kroah-Hartman
2013-08-27  1:08 ` [ 42/74] xen/events: mask events when changing their VCPU binding Greg Kroah-Hartman
2013-08-27  1:08 ` [ 43/74] ARM: davinci: nand: specify ecc strength Greg Kroah-Hartman
2013-08-27  1:08 ` [ 44/74] ARM: at91/DT: fix at91sam9n12ek memory node Greg Kroah-Hartman
2013-08-27  1:08 ` [ 45/74] arm64: perf: fix array out of bounds access in armpmu_map_hw_event() Greg Kroah-Hartman
2013-08-27  1:08 ` [ 46/74] arm64: perf: fix event validation for software group leaders Greg Kroah-Hartman
2013-08-27  1:08 ` [ 47/74] ARM: 7816/1: CONFIG_KUSER_HELPERS: fix help text Greg Kroah-Hartman
2013-08-27  1:08 ` [ 48/74] staging: comedi: bug-fix NULL pointer dereference on failed attach Greg Kroah-Hartman
2013-08-27  1:08 ` [ 49/74] drm/radeon/r7xx: fix copy paste typo in golden register setup Greg Kroah-Hartman
2013-08-27  1:08 ` [ 50/74] drm/radeon: fix UVD message buffer validation Greg Kroah-Hartman
2013-08-27  1:08 ` [ 51/74] drm/radeon: fix WREG32_OR macro setting bits in a register Greg Kroah-Hartman
2013-08-27  1:08 ` [ 52/74] drm/i915: Invalidate TLBs for the rings after a reset Greg Kroah-Hartman
2013-08-27  1:08 ` [ 53/74] of: fdt: fix memory initialization for expanded DT Greg Kroah-Hartman
2013-08-27  1:08 ` [ 54/74] nilfs2: remove double bio_put() in nilfs_end_bio_write() for BIO_EOPNOTSUPP error Greg Kroah-Hartman
2013-08-27  1:08 ` [ 55/74] nilfs2: fix issue with counting number of bio requests for BIO_EOPNOTSUPP error detection Greg Kroah-Hartman
2013-08-27  1:08 ` [ 56/74] drivers/platform/olpc/olpc-ec.c: initialise earlier Greg Kroah-Hartman
2013-08-27  1:08 ` [ 57/74] usb: phy: fix build breakage Greg Kroah-Hartman
2013-08-27  1:08 ` [ 58/74] sata_fsl: save irqs while coalescing Greg Kroah-Hartman
2013-08-27  1:08 ` [ 59/74] Hostap: copying wrong data prism2_ioctl_giwaplist() Greg Kroah-Hartman
2013-08-27  1:08 ` [ 60/74] libata: apply behavioral quirks to sil3826 PMP Greg Kroah-Hartman
2013-08-27  1:08 ` [ 61/74] iwlwifi: dvm: fix calling ieee80211_chswitch_done() with NULL Greg Kroah-Hartman
2013-08-27  1:09 ` [ 62/74] iwlwifi: pcie: disable L1 Active after pci_enable_device Greg Kroah-Hartman
2013-08-27  1:09 ` Greg Kroah-Hartman [this message]
2013-08-27  1:09 ` [ 64/74] SCSI: zfcp: fix schedule-inside-lock in scsi_device list loops Greg Kroah-Hartman
2013-08-27  1:09 ` [ 65/74] SCSI: lpfc: Dont force CONFIG_GENERIC_CSUM on Greg Kroah-Hartman
2013-08-27  1:09 ` [ 66/74] SCSI: sg: Fix user memory corruption when SG_IO is interrupted by a signal Greg Kroah-Hartman
2013-08-27  1:09 ` [ 67/74] Revert "x86 get_unmapped_area(): use proper mmap base for bottom-up direction" Greg Kroah-Hartman
2013-08-27  1:09 ` [ 68/74] x86 get_unmapped_area: Access mmap_legacy_base through mm_struct member Greg Kroah-Hartman
2013-08-27  1:09 ` [ 69/74] x86/xen: do not identity map UNUSABLE regions in the machine E820 Greg Kroah-Hartman
2013-08-27  1:09 ` [ 70/74] mei: me: fix reset state machine Greg Kroah-Hartman
2013-08-27  1:09 ` [ 71/74] mei: dont have to clean the state on power up Greg Kroah-Hartman
2013-08-27  1:09 ` [ 72/74] mei: me: fix waiting for hw ready Greg Kroah-Hartman
2013-08-27  1:09 ` [ 73/74] md: bcache: io.c: fix a potential NULL pointer dereference Greg Kroah-Hartman
2013-08-27  1:09 ` [ 74/74] bcache: FUA fixes Greg Kroah-Hartman
2013-08-27  4:31 ` [ 00/74] 3.10.10-stable review Guenter Roeck
2013-08-27 22:27   ` Greg Kroah-Hartman
2013-08-27 20:41 ` Shuah Khan
2013-08-27 22:27   ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130827010437.390785185@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=JBottomley@Parallels.com \
    --cc=heiko.carstens@de.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maier@linux.vnet.ibm.com \
    --cc=mpatocka@redhat.com \
    --cc=mpeschke@linux.vnet.ibm.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.