stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Michal Hocko <mhocko@suse.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Hannes Reinecke <hare@suse.de>,
	James Bottomley <JBottomley@Odin.com>
Subject: [PATCH 3.10 01/54] scsi: fix scsi_error_handler vs. scsi_host_dev_release race
Date: Sat, 17 Oct 2015 19:05:05 -0700	[thread overview]
Message-ID: <20151018020314.124159016@linuxfoundation.org> (raw)
In-Reply-To: <20151018020314.063429128@linuxfoundation.org>

3.10-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Michal Hocko <mhocko@suse.com>

commit 537b604c8b3aa8b96fe35f87dd085816552e294c upstream.

b9d5c6b7ef57 ("[SCSI] cleanup setting task state in
scsi_error_handler()") has introduced a race between scsi_error_handler
and scsi_host_dev_release resulting in the hang when the device goes
away because scsi_error_handler might miss a wake up:

CPU0					CPU1
scsi_error_handler			scsi_host_dev_release
  					  kthread_stop()
  kthread_should_stop()
    test_bit(KTHREAD_SHOULD_STOP)
					    set_bit(KTHREAD_SHOULD_STOP)
					    wake_up_process()
					    wait_for_completion()

  set_current_state(TASK_INTERRUPTIBLE)
  schedule()

The most straightforward solution seems to be to invert the ordering of
the set_current_state and kthread_should_stop.

The issue has been noticed during reboot test on a 3.0 based kernel but
the current code seems to be affected in the same way.

[jejb: additional comment added]
Reported-and-debugged-by: Mike Mayer <Mike.Meyer@teradata.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/scsi/scsi_error.c |   11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1849,8 +1849,17 @@ int scsi_error_handler(void *data)
 	 * We never actually get interrupted because kthread_run
 	 * disables signal delivery for the created thread.
 	 */
-	while (!kthread_should_stop()) {
+	while (true) {
+		/*
+		 * The sequence in kthread_stop() sets the stop flag first
+		 * then wakes the process.  To avoid missed wakeups, the task
+		 * should always be in a non running state before the stop
+		 * flag is checked
+		 */
 		set_current_state(TASK_INTERRUPTIBLE);
+		if (kthread_should_stop())
+			break;
+
 		if ((shost->host_failed == 0 && shost->host_eh_scheduled == 0) ||
 		    shost->host_failed != shost->host_busy) {
 			SCSI_LOG_ERROR_RECOVERY(1,



  reply	other threads:[~2015-10-18  2:51 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-18  2:05 [PATCH 3.10 00/54] 3.10.91-stable review Greg Kroah-Hartman
2015-10-18  2:05 ` Greg Kroah-Hartman [this message]
2015-10-18  2:05 ` [PATCH 3.10 02/54] perf header: Fixup reading of HEADER_NRCPUS feature Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 03/54] ARM: 8429/1: disable GCC SRA optimization Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 04/54] windfarm: decrement client count when unregistering Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 05/54] x86/apic: Serialize LVTT and TSC_DEADLINE writes Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 06/54] x86/platform: Fix Geode LX timekeeping in the generic x86 build Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 07/54] Use WARN_ON_ONCE for missing X86_FEATURE_NRIPS Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 08/54] x86/mm: Set NX on gap between __ex_table and rodata Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 09/54] x86/xen: Support kexec/kdump in HVM guests by doing a soft reset Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 10/54] spi: Fix documentation of spi_alloc_master() Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 11/54] spi: spi-pxa2xx: Check status register to determine if SSSR_TINT is disabled Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 12/54] mm: hugetlbfs: skip shared VMAs when unmapping private pages to satisfy a fault Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 13/54] ALSA: synth: Fix conflicting OSS device registration on AWE32 Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 14/54] ASoC: fix broken pxa SoC support Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 15/54] ASoC: dwc: correct irq clear method Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 16/54] btrfs: skip waiting on ordered range for special files Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 17/54] staging: comedi: adl_pci7x3x: fix digital output on PCI-7230 Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 18/54] dm btree: add ref counting ops for the leaves of top level btrees Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 19/54] USB: option: add ZTE PIDs Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 20/54] dm raid: fix round up of default region size Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 21/54] netfilter: nf_conntrack: Support expectations in different zones Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 22/54] disabling oplocks/leases via module parm enable_oplocks broken for SMB3 Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 23/54] drm: Reject DRI1 hw lock ioctl functions for kms drivers Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 24/54] USB: whiteheat: fix potential null-deref at probe Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 25/54] usb: xhci: Clear XHCI_STATE_DYING on start Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 26/54] xhci: change xhci 1.0 only restrictions to support xhci 1.1 Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 27/54] usb: xhci: Add support for URB_ZERO_PACKET to bulk/sg transfers Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 28/54] Initialize msg/shm IPC objects before doing ipc_addid() Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 29/54] ipvs: do not use random local source address for tunnels Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 30/54] ipvs: fix crash with sync protocol v0 and FTP Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 31/54] udf: Check length of extended attributes and allocation descriptors Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 32/54] regmap: debugfs: Ensure we dont underflow when printing access masks Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 33/54] regmap: debugfs: Dont bother actually printing when calculating max length Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 34/54] security: fix typo in security_task_prctl Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 35/54] usb: Use the USB_SS_MULT() macro to get the burst multiplier Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 36/54] usb: Add device quirk for Logitech PTZ cameras Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 37/54] USB: Add reset-resume quirk for two Plantronics usb headphones Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 38/54] MIPS: dma-default: Fix 32-bit fall back to GFP_DMA Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 39/54] md: flush ->event_work before stopping array Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 40/54] powerpc/MSI: Fix race condition in tearing down MSI interrupts Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 41/54] UBI: Validate data_size Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 42/54] UBI: return ENOSPC if no enough space available Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 43/54] IB/qib: Change lkey table allocation to support more MRs Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 44/54] dcache: Handle escaped paths in prepend_path Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 45/54] vfs: Test for and handle paths that are unreachable from their mnt_root Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 46/54] arm64: readahead: fault retry breaks mmap file read random detection Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 47/54] m68k: Define asmlinkage_protect Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 48/54] bonding: correct the MAC address for "follow" fail_over_mac policy Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 49/54] fib_rules: Fix dump_rules() not to exit early Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 50/54] genirq: Fix race in register_irq_proc() Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 51/54] x86: Add 1/2/4/8 byte optimization to 64bit __copy_{from,to}_user_inatomic Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 52/54] dm cache: fix NULL pointer when switching from cleaner policy Greg Kroah-Hartman
2015-10-18  2:05 ` [PATCH 3.10 53/54] staging: speakup: fix speakup-r regression Greg Kroah-Hartman
2015-10-18  7:05 ` [PATCH 3.10 00/54] 3.10.91-stable review Willy Tarreau
2015-10-18 16:05   ` Greg Kroah-Hartman
2015-10-18 19:17     ` Willy Tarreau
2015-10-18 19:38       ` Greg Kroah-Hartman
2015-10-19  4:05 ` Guenter Roeck
2015-10-19 15:14   ` Greg Kroah-Hartman
2015-10-19 15:19 ` Shuah Khan
2015-10-22 21:35   ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151018020314.124159016@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=JBottomley@Odin.com \
    --cc=dan.j.williams@intel.com \
    --cc=hare@suse.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhocko@suse.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).