All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Gary Guo <ghg@datera.io>,
	Mike Christie <mchristi@redhat.com>,
	Hannes Reinecke <hare@suse.de>,
	Nicholas Bellinger <nab@linux-iscsi.org>
Subject: [PATCH 3.18 03/38] iscsi-target: Fix iscsi_np reset hung task during parallel delete
Date: Sun, 19 Nov 2017 15:29:19 +0100	[thread overview]
Message-ID: <20171119142922.037040189@linuxfoundation.org> (raw)
In-Reply-To: <20171119142921.807414664@linuxfoundation.org>

3.18-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Nicholas Bellinger <nab@linux-iscsi.org>

commit 978d13d60c34818a41fc35962602bdfa5c03f214 upstream.

This patch fixes a bug associated with iscsit_reset_np_thread()
that can occur during parallel configfs rmdir of a single iscsi_np
used across multiple iscsi-target instances, that would result in
hung task(s) similar to below where configfs rmdir process context
was blocked indefinately waiting for iscsi_np->np_restart_comp
to finish:

[ 6726.112076] INFO: task dcp_proxy_node_:15550 blocked for more than 120 seconds.
[ 6726.119440]       Tainted: G        W  O     4.1.26-3321 #2
[ 6726.125045] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6726.132927] dcp_proxy_node_ D ffff8803f202bc88     0 15550      1 0x00000000
[ 6726.140058]  ffff8803f202bc88 ffff88085c64d960 ffff88083b3b1ad0 ffff88087fffeb08
[ 6726.147593]  ffff8803f202c000 7fffffffffffffff ffff88083f459c28 ffff88083b3b1ad0
[ 6726.155132]  ffff88035373c100 ffff8803f202bca8 ffffffff8168ced2 ffff8803f202bcb8
[ 6726.162667] Call Trace:
[ 6726.165150]  [<ffffffff8168ced2>] schedule+0x32/0x80
[ 6726.170156]  [<ffffffff8168f5b4>] schedule_timeout+0x214/0x290
[ 6726.176030]  [<ffffffff810caef2>] ? __send_signal+0x52/0x4a0
[ 6726.181728]  [<ffffffff8168d7d6>] wait_for_completion+0x96/0x100
[ 6726.187774]  [<ffffffff810e7c80>] ? wake_up_state+0x10/0x10
[ 6726.193395]  [<ffffffffa035d6e2>] iscsit_reset_np_thread+0x62/0xe0 [iscsi_target_mod]
[ 6726.201278]  [<ffffffffa0355d86>] iscsit_tpg_disable_portal_group+0x96/0x190 [iscsi_target_mod]
[ 6726.210033]  [<ffffffffa0363f7f>] lio_target_tpg_store_enable+0x4f/0xc0 [iscsi_target_mod]
[ 6726.218351]  [<ffffffff81260c5a>] configfs_write_file+0xaa/0x110
[ 6726.224392]  [<ffffffff811ea364>] vfs_write+0xa4/0x1b0
[ 6726.229576]  [<ffffffff811eb111>] SyS_write+0x41/0xb0
[ 6726.234659]  [<ffffffff8169042e>] system_call_fastpath+0x12/0x71

It would happen because each iscsit_reset_np_thread() sets state
to ISCSI_NP_THREAD_RESET, sends SIGINT, and then blocks waiting
for completion on iscsi_np->np_restart_comp.

However, if iscsi_np was active processing a login request and
more than a single iscsit_reset_np_thread() caller to the same
iscsi_np was blocked on iscsi_np->np_restart_comp, iscsi_np
kthread process context in __iscsi_target_login_thread() would
flush pending signals and only perform a single completion of
np->np_restart_comp before going back to sleep within transport
specific iscsit_transport->iscsi_accept_np code.

To address this bug, add a iscsi_np->np_reset_count and update
__iscsi_target_login_thread() to keep completing np->np_restart_comp
until ->np_reset_count has reached zero.

Reported-by: Gary Guo <ghg@datera.io>
Tested-by: Gary Guo <ghg@datera.io>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


---
 drivers/target/iscsi/iscsi_target.c       |    1 +
 drivers/target/iscsi/iscsi_target_core.h  |    1 +
 drivers/target/iscsi/iscsi_target_login.c |    7 +++++--
 include/target/iscsi/iscsi_target_core.h  |    1 +
 4 files changed, 8 insertions(+), 2 deletions(-)

--- a/drivers/target/iscsi/iscsi_target.c
+++ b/drivers/target/iscsi/iscsi_target.c
@@ -428,6 +428,7 @@ int iscsit_reset_np_thread(
 		return 0;
 	}
 	np->np_thread_state = ISCSI_NP_THREAD_RESET;
+	atomic_inc(&np->np_reset_count);
 
 	if (np->np_thread) {
 		spin_unlock_bh(&np->np_thread_lock);
--- a/drivers/target/iscsi/iscsi_target_core.h
+++ b/drivers/target/iscsi/iscsi_target_core.h
@@ -783,6 +783,7 @@ struct iscsi_np {
 	int			np_sock_type;
 	enum np_thread_state_table np_thread_state;
 	bool                    enabled;
+	atomic_t		np_reset_count;
 	enum iscsi_timer_flags_table np_login_timer_flags;
 	u32			np_exports;
 	enum np_flags_table	np_flags;
--- a/drivers/target/iscsi/iscsi_target_login.c
+++ b/drivers/target/iscsi/iscsi_target_login.c
@@ -1275,9 +1275,11 @@ static int __iscsi_target_login_thread(s
 	flush_signals(current);
 
 	spin_lock_bh(&np->np_thread_lock);
-	if (np->np_thread_state == ISCSI_NP_THREAD_RESET) {
+	if (atomic_dec_if_positive(&np->np_reset_count) >= 0) {
 		np->np_thread_state = ISCSI_NP_THREAD_ACTIVE;
+		spin_unlock_bh(&np->np_thread_lock);
 		complete(&np->np_restart_comp);
+		return 1;
 	} else if (np->np_thread_state == ISCSI_NP_THREAD_SHUTDOWN) {
 		spin_unlock_bh(&np->np_thread_lock);
 		goto exit;
@@ -1310,7 +1312,8 @@ static int __iscsi_target_login_thread(s
 		goto exit;
 	} else if (rc < 0) {
 		spin_lock_bh(&np->np_thread_lock);
-		if (np->np_thread_state == ISCSI_NP_THREAD_RESET) {
+		if (atomic_dec_if_positive(&np->np_reset_count) >= 0) {
+			np->np_thread_state = ISCSI_NP_THREAD_ACTIVE;
 			spin_unlock_bh(&np->np_thread_lock);
 			complete(&np->np_restart_comp);
 			iscsit_put_transport(conn->conn_transport);
--- a/include/target/iscsi/iscsi_target_core.h
+++ b/include/target/iscsi/iscsi_target_core.h
@@ -784,6 +784,7 @@ struct iscsi_np {
 	int			np_sock_type;
 	enum np_thread_state_table np_thread_state;
 	bool                    enabled;
+	atomic_t		np_reset_count;
 	enum iscsi_timer_flags_table np_login_timer_flags;
 	u32			np_exports;
 	enum np_flags_table	np_flags;

  parent reply	other threads:[~2017-11-19 14:30 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-11-19 14:29 [PATCH 3.18 00/38] 3.18.83-stable review Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 01/38] media: imon: Fix null-ptr-deref in imon_probe Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 02/38] media: dib0700: fix invalid dvb_detach argument Greg Kroah-Hartman
2017-11-19 14:29 ` Greg Kroah-Hartman [this message]
2017-11-19 14:29 ` [PATCH 3.18 04/38] extcon: palmas: Check the parent instance to prevent the NULL Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 05/38] ARM: OMAP2+: Fix init for multiple quirks for the same SoC Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 06/38] ARM: dts: Fix omap3 off mode pull defines Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 07/38] ata: ATA_BMDMA should depend on HAS_DMA Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 08/38] ata: SATA_HIGHBANK " Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 09/38] ata: SATA_MV " Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 10/38] drm/sti: sti_vtg: Handle return NULL error from devm_ioremap_nocache Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 11/38] igb: reset the PHY before reading the PHY ID Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 12/38] igb: close/suspend race in netif_device_detach Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 13/38] igb: Fix hw_dbg logging in igb_update_flash_i210 Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 14/38] staging: rtl8188eu: fix incorrect ERROR tags from logs Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 15/38] scsi: lpfc: Add missing memory barrier Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 16/38] scsi: lpfc: FCoE VPort enable-disable does not bring up the VPort Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 17/38] scsi: lpfc: Correct host name in symbolic_name field Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 18/38] scsi: lpfc: Correct issue leading to oops during link reset Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 19/38] ALSA: vx: Dont try to update capture stream before running Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 20/38] ALSA: vx: Fix possible transfer overflow Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 22/38] backlight: adp5520: Fix error handling in adp5520_bl_probe() Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 23/38] gpu: drm: mgag200: mgag200_main:- Handle error from pci_iomap Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 24/38] ixgbe: fix AER error handling Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 25/38] ixgbe: handle close/suspend race with netif_device_detach/present Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 26/38] MIPS: End asm function prologue macros with .insn Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 27/38] MIPS: init: Ensure reserved memory regions are not added to bootmem Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 28/38] MIPS: Netlogic: Exclude netlogic,xlp-pic code from XLR builds Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 29/38] Revert "crypto: xts - Add ECB dependency" Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 30/38] Revert "uapi: fix linux/rds.h userspace compilation errors" Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 31/38] uapi: fix linux/rds.h userspace compilation error Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 32/38] uapi: fix linux/rds.h userspace compilation errors Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 33/38] USB: usbfs: compute urb->actual_length for isochronous Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 34/38] USB: Add delay-init quirk for Corsair K70 LUX keyboards Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 35/38] USB: serial: qcserial: add pid/vid for Sierra Wireless EM7355 fw update Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 36/38] USB: serial: garmin_gps: fix memory leak on failed URB submit Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 37/38] USB: serial: garmin_gps: fix I/O after failed probe and remove Greg Kroah-Hartman
2017-11-19 14:29 ` [PATCH 3.18 38/38] USB: serial: garmin_gps: fix memory leak on probe errors Greg Kroah-Hartman
2017-11-20 14:06 ` [PATCH 3.18 00/38] 3.18.83-stable review Guenter Roeck
2017-11-20 14:48   ` Greg Kroah-Hartman
2017-11-20 15:41     ` Guenter Roeck
2017-11-20 16:25     ` Guenter Roeck
2017-11-20 19:17       ` Greg Kroah-Hartman
2017-11-20 21:15 ` Shuah Khan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171119142922.037040189@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=ghg@datera.io \
    --cc=hare@suse.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mchristi@redhat.com \
    --cc=nab@linux-iscsi.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.