From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Rob Millner <rlm@daterainc.com>,
Vaibhav Tandon <vst@datera.io>,
"Bryant G. Ly" <bryantly@linux.vnet.ibm.com>,
Nicholas Bellinger <nab@linux-iscsi.org>
Subject: [PATCH 4.4 20/36] target: Fix NULL dereference during LUN lookup + active I/O shutdown
Date: Mon, 13 Mar 2017 16:39:20 +0800
Message-ID: <20170313083353.746407930@linuxfoundation.org>
In-Reply-To: <20170313083352.550085638@linuxfoundation.org>
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicholas Bellinger <nab@linux-iscsi.org>
commit bd4e2d2907fa23a11d46217064ecf80470ddae10 upstream.
When transport_clear_lun_ref() is shutting down a se_lun via
configfs with new I/O in-flight, it's possible to trigger a
NULL pointer dereference in transport_lookup_cmd_lun() due
to the fact percpu_ref_get() doesn't do any __PERCPU_REF_DEAD
checking before incrementing lun->lun_ref.count after
lun->lun_ref has switched to atomic_t mode.
This results in a NULL pointer dereference as LUN shutdown
code in core_tpg_remove_lun() continues running after the
existing ->release() -> core_tpg_lun_ref_release() callback
completes, and clears the RCU protected se_lun->lun_se_dev
pointer.
During the OOPs, the state of lun->lun_ref in the process
which triggered the NULL pointer dereference looks like
the following on v4.1.y stable code:
struct se_lun {
  lun_link_magic = 4294932337,
  lun_status = TRANSPORT_LUN_STATUS_FREE,
  .....
  lun_se_dev = 0x0,
  lun_sep = 0x0,
  .....
  lun_ref = {
    count = {
      counter = 1
    },
    percpu_count_ptr = 3,
    release = 0xffffffffa02fa1e0 <core_tpg_lun_ref_release>,
    confirm_switch = 0x0,
    force_atomic = false,
    rcu = {
      next = 0xffff88154fa1a5d0,
      func = 0xffffffff8137c4c0 <percpu_ref_switch_to_atomic_rcu>
    }
  }
}
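
For readers decoding the dump above: percpu_count_ptr = 3 is consistent
with both low flag bits being set, assuming the flag layout used by
include/linux/percpu-refcount.h in these kernels (__PERCPU_REF_ATOMIC
is bit 0, __PERCPU_REF_DEAD is bit 1). A small userspace sketch of that
decoding follows; the MODEL_* macros and model_*() helpers are
hypothetical names for illustration, not kernel API:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Assumed to mirror the kernel's __PERCPU_REF_ATOMIC and
 * __PERCPU_REF_DEAD flag bits; the MODEL_ names are made up here.
 */
#define MODEL_PERCPU_REF_ATOMIC	(1UL << 0)
#define MODEL_PERCPU_REF_DEAD	(1UL << 1)

/* True if the ref has left percpu mode and counts in ->count. */
static inline bool model_ref_is_atomic(unsigned long percpu_count_ptr)
{
	return (percpu_count_ptr & MODEL_PERCPU_REF_ATOMIC) != 0;
}

/* True if the ref has been killed. */
static inline bool model_ref_is_dead(unsigned long percpu_count_ptr)
{
	return (percpu_count_ptr & MODEL_PERCPU_REF_DEAD) != 0;
}

/* Check the value reported in the dump: both bits should be set. */
static inline void model_check_dump_value(void)
{
	const unsigned long dumped = 3UL;	/* percpu_count_ptr from the dump */

	assert(model_ref_is_atomic(dumped));
	assert(model_ref_is_dead(dumped));
}
```

Under that assumption the ref was already dead and in atomic_t mode,
yet counter = 1 shows a reference still being held.
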
To address this bug, use percpu_ref_tryget_live() to ensure that
once __PERCPU_REF_DEAD is visible on all CPUs and ->lun_ref
has switched to atomic_t mode, all new I/Os will fail to obtain
a new lun->lun_ref reference.
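
The difference between the two lookups can be sketched in userspace.
This is a minimal single-threaded model, assuming the ref has already
switched to atomic_t mode; struct model_ref and the model_ref_*()
helpers are hypothetical names for illustration and are not the
kernel's percpu_ref implementation:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a percpu_ref in atomic_t mode. */
struct model_ref {
	long count;	/* models lun->lun_ref.count */
	bool dead;	/* models the __PERCPU_REF_DEAD flag */
};

static inline void model_ref_init(struct model_ref *ref)
{
	ref->count = 1;		/* the initial reference */
	ref->dead = false;
}

/* Models percpu_ref_get(): increments without looking at ->dead. */
static inline void model_ref_get(struct model_ref *ref)
{
	ref->count++;
}

/* Models percpu_ref_tryget_live(): refuses once the ref is dead. */
static inline bool model_ref_tryget_live(struct model_ref *ref)
{
	if (ref->dead || ref->count == 0)
		return false;
	ref->count++;
	return true;
}

/* Models percpu_ref_kill(): mark dead, drop the initial reference. */
static inline void model_ref_kill(struct model_ref *ref)
{
	assert(ref->count > 0);
	ref->dead = true;
	ref->count--;
}
```

In this model a model_ref_get() issued after model_ref_kill() still
bumps the count, which is the window the patch closes by switching the
lookup to the tryget_live variant.
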
Also pass an explicit confirm callback to percpu_ref_kill_and_confirm()
and block on ->lun_ref_comp to allow the first stage and
associated RCU grace period to complete, and then block on
->lun_shutdown_comp waiting for the final percpu_ref_put()
to drop the last reference via transport_lun_remove_cmd()
before continuing with core_tpg_remove_lun() shutdown.
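
The ordering of the two stages can be sketched as a simplified,
single-threaded userspace model. It collapses the RCU grace period
into a direct call and replaces the completions with booleans;
struct model_lun and the model_*() helpers are hypothetical names,
not code from this patch:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the se_lun fields involved in shutdown. */
struct model_lun {
	long ref_count;		/* models lun->lun_ref.count */
	bool ref_dead;		/* models __PERCPU_REF_DEAD */
	bool ref_comp_done;	/* models complete(&lun->lun_ref_comp) */
	bool shutdown_done;	/* models complete(&lun->lun_shutdown_comp) */
};

static inline void model_lun_init(struct model_lun *lun)
{
	lun->ref_count = 1;	/* the initial reference */
	lun->ref_dead = false;
	lun->ref_comp_done = false;
	lun->shutdown_done = false;
}

/* Models the new I/O lookup: fails once the ref is dead. */
static inline bool model_lookup_cmd_lun(struct model_lun *lun)
{
	if (lun->ref_dead || lun->ref_count == 0)
		return false;
	lun->ref_count++;
	return true;
}

/* Models percpu_ref_put(): the final put runs ->release(). */
static inline void model_lun_put(struct model_lun *lun)
{
	assert(lun->ref_count > 0);
	if (--lun->ref_count == 0)
		lun->shutdown_done = true;
}

/* Stage one: mark dead, run the confirm callback, drop initial ref. */
static inline void model_kill_and_confirm(struct model_lun *lun)
{
	lun->ref_dead = true;
	lun->ref_comp_done = true;
	model_lun_put(lun);
}

/* True once both waits in the shutdown path would have returned. */
static inline bool model_shutdown_may_proceed(const struct model_lun *lun)
{
	return lun->ref_comp_done && lun->shutdown_done;
}
```

With one command still outstanding, stage one completes but the
shutdown may not proceed until that command's reference is put.
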
Reported-by: Rob Millner <rlm@daterainc.com>
Tested-by: Rob Millner <rlm@daterainc.com>
Cc: Rob Millner <rlm@daterainc.com>
Tested-by: Vaibhav Tandon <vst@datera.io>
Cc: Vaibhav Tandon <vst@datera.io>
Tested-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/target/target_core_device.c | 10 ++++++++--
drivers/target/target_core_tpg.c | 3 ++-
drivers/target/target_core_transport.c | 31 ++++++++++++++++++++++++++++++-
include/target/target_core_base.h | 1 +
4 files changed, 41 insertions(+), 4 deletions(-)
--- a/drivers/target/target_core_device.c
+++ b/drivers/target/target_core_device.c
@@ -77,12 +77,16 @@ transport_lookup_cmd_lun(struct se_cmd *
 					&deve->read_bytes);
 
 		se_lun = rcu_dereference(deve->se_lun);
+
+		if (!percpu_ref_tryget_live(&se_lun->lun_ref)) {
+			se_lun = NULL;
+			goto out_unlock;
+		}
+
 		se_cmd->se_lun = rcu_dereference(deve->se_lun);
 		se_cmd->pr_res_key = deve->pr_res_key;
 		se_cmd->orig_fe_lun = unpacked_lun;
 		se_cmd->se_cmd_flags |= SCF_SE_LUN_CMD;
-
-		percpu_ref_get(&se_lun->lun_ref);
 		se_cmd->lun_ref_active = true;
 
 		if ((se_cmd->data_direction == DMA_TO_DEVICE) &&
@@ -96,6 +100,7 @@ transport_lookup_cmd_lun(struct se_cmd *
 			goto ref_dev;
 		}
 	}
+out_unlock:
 	rcu_read_unlock();
 
 	if (!se_lun) {
@@ -826,6 +831,7 @@ struct se_device *target_alloc_device(st
 	xcopy_lun = &dev->xcopy_lun;
 	rcu_assign_pointer(xcopy_lun->lun_se_dev, dev);
 	init_completion(&xcopy_lun->lun_ref_comp);
+	init_completion(&xcopy_lun->lun_shutdown_comp);
 	INIT_LIST_HEAD(&xcopy_lun->lun_deve_list);
 	INIT_LIST_HEAD(&xcopy_lun->lun_dev_link);
 	mutex_init(&xcopy_lun->lun_tg_pt_md_mutex);
--- a/drivers/target/target_core_tpg.c
+++ b/drivers/target/target_core_tpg.c
@@ -539,7 +539,7 @@ static void core_tpg_lun_ref_release(str
 {
 	struct se_lun *lun = container_of(ref, struct se_lun, lun_ref);
 
-	complete(&lun->lun_ref_comp);
+	complete(&lun->lun_shutdown_comp);
 }
 
 int core_tpg_register(
@@ -666,6 +666,7 @@ struct se_lun *core_tpg_alloc_lun(
 	lun->lun_link_magic = SE_LUN_LINK_MAGIC;
 	atomic_set(&lun->lun_acl_count, 0);
 	init_completion(&lun->lun_ref_comp);
+	init_completion(&lun->lun_shutdown_comp);
 	INIT_LIST_HEAD(&lun->lun_deve_list);
 	INIT_LIST_HEAD(&lun->lun_dev_link);
 	atomic_set(&lun->lun_tg_pt_secondary_offline, 0);
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -2680,10 +2680,39 @@ void target_wait_for_sess_cmds(struct se
 }
 EXPORT_SYMBOL(target_wait_for_sess_cmds);
 
+static void target_lun_confirm(struct percpu_ref *ref)
+{
+	struct se_lun *lun = container_of(ref, struct se_lun, lun_ref);
+
+	complete(&lun->lun_ref_comp);
+}
+
 void transport_clear_lun_ref(struct se_lun *lun)
 {
-	percpu_ref_kill(&lun->lun_ref);
+	/*
+	 * Mark the percpu-ref as DEAD, switch to atomic_t mode, drop
+	 * the initial reference and schedule confirm kill to be
+	 * executed after one full RCU grace period has completed.
+	 */
+	percpu_ref_kill_and_confirm(&lun->lun_ref, target_lun_confirm);
+	/*
+	 * The first completion waits for percpu_ref_switch_to_atomic_rcu()
+	 * to call target_lun_confirm after lun->lun_ref has been marked
+	 * as __PERCPU_REF_DEAD on all CPUs, and switches to atomic_t
+	 * mode so that percpu_ref_tryget_live() lookup of lun->lun_ref
+	 * fails for all new incoming I/O.
+	 */
 	wait_for_completion(&lun->lun_ref_comp);
+	/*
+	 * The second completion waits for percpu_ref_put_many() to
+	 * invoke ->release() after lun->lun_ref has switched to
+	 * atomic_t mode, and lun->lun_ref.count has reached zero.
+	 *
+	 * At this point all target-core lun->lun_ref references have
+	 * been dropped via transport_lun_remove_cmd(), and it's safe
+	 * to proceed with the remaining LUN shutdown.
+	 */
+	wait_for_completion(&lun->lun_shutdown_comp);
 }
 
 static bool
--- a/include/target/target_core_base.h
+++ b/include/target/target_core_base.h
@@ -740,6 +740,7 @@ struct se_lun {
 	struct config_group	lun_group;
 	struct se_port_stat_grps port_stat_grps;
 	struct completion	lun_ref_comp;
+	struct completion	lun_shutdown_comp;
 	struct percpu_ref	lun_ref;
 	struct list_head	lun_dev_link;
 	struct hlist_node	link;