From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Rob Millner <rlm@daterainc.com>,
Vaibhav Tandon <vst@datera.io>,
"Bryant G. Ly" <bryantly@linux.vnet.ibm.com>,
Nicholas Bellinger <nab@linux-iscsi.org>
Subject: [PATCH 4.10 36/75] target: Fix NULL dereference during LUN lookup + active I/O shutdown
Date: Mon, 13 Mar 2017 16:43:45 +0800
Message-ID: <20170313083413.468940433@linuxfoundation.org>
In-Reply-To: <20170313083411.408297387@linuxfoundation.org>
4.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicholas Bellinger <nab@linux-iscsi.org>
commit bd4e2d2907fa23a11d46217064ecf80470ddae10 upstream.
When transport_clear_lun_ref() is shutting down a se_lun via
configfs with new I/O in-flight, it's possible to trigger a
NULL pointer dereference in transport_lookup_cmd_lun() due
to the fact percpu_ref_get() doesn't do any __PERCPU_REF_DEAD
checking before incrementing lun->lun_ref.count after
lun->lun_ref has switched to atomic_t mode.
This results in a NULL pointer dereference as LUN shutdown
code in core_tpg_remove_lun() continues running after the
existing ->release() -> core_tpg_lun_ref_release() callback
completes, and clears the RCU protected se_lun->lun_se_dev
pointer.
At the time of the Oops, the state of lun->lun_ref in the process
that triggered the NULL pointer dereference looks like
the following on v4.1.y stable code:
  struct se_lun {
    lun_link_magic = 4294932337,
    lun_status = TRANSPORT_LUN_STATUS_FREE,
    .....
    lun_se_dev = 0x0,
    lun_sep = 0x0,
    .....
    lun_ref = {
      count = {
        counter = 1
      },
      percpu_count_ptr = 3,
      release = 0xffffffffa02fa1e0 <core_tpg_lun_ref_release>,
      confirm_switch = 0x0,
      force_atomic = false,
      rcu = {
        next = 0xffff88154fa1a5d0,
        func = 0xffffffff8137c4c0 <percpu_ref_switch_to_atomic_rcu>
      }
    }
  }
To address this bug, use percpu_ref_tryget_live() to ensure that
once __PERCPU_REF_DEAD is visible on all CPUs and ->lun_ref
has switched to atomic_t, all new I/Os will fail to obtain
a new lun->lun_ref reference.
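The difference between the two lookups can be sketched in userspace C.
This is only a rough model under stated assumptions, not the kernel
implementation: struct model_ref and the model_ref_*() helpers are
hypothetical names, the ref is assumed to already be in atomic mode, and
C11 atomics stand in for the kernel's atomic_long_t and the
__PERCPU_REF_DEAD flag. Starting from a dead ref whose count has already
reached zero, the unconditional get raises the count to 1, which would be
consistent with the counter = 1 seen in the dump above, while the
tryget-live variant refuses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a percpu_ref in atomic mode. */
struct model_ref {
	atomic_long count;	/* models lun_ref.count after the switch to atomic_t */
	atomic_bool dead;	/* models the __PERCPU_REF_DEAD flag */
};

/* Models percpu_ref_get(): increments without looking at the dead flag. */
static void model_ref_get(struct model_ref *ref)
{
	atomic_fetch_add(&ref->count, 1);
}

/*
 * Models percpu_ref_tryget_live(): fails once the ref is marked dead,
 * and otherwise only increments a count that is still non-zero.
 */
static bool model_ref_tryget_live(struct model_ref *ref)
{
	long old;

	if (atomic_load(&ref->dead))
		return false;

	old = atomic_load(&ref->count);
	while (old != 0) {
		/* On failure, old is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&ref->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Walks through the buggy and the fixed lookup on a ref under shutdown. */
static int model_ref_demo(void)
{
	struct model_ref ref;

	atomic_init(&ref.count, 0);	/* last reference already dropped */
	atomic_init(&ref.dead, true);	/* shutdown has already begun */

	/* Old behaviour: the get succeeds on a dead ref. */
	model_ref_get(&ref);
	assert(atomic_load(&ref.count) == 1);

	/* New behaviour: the lookup is refused and the count is untouched. */
	assert(!model_ref_tryget_live(&ref));
	assert(atomic_load(&ref.count) == 1);
	return 0;
}
```

Calling model_ref_demo() from a test harness exercises both paths; the
real percpu_ref additionally handles the per-CPU fast path, which this
sketch leaves out.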
Also use an explicit percpu_ref_kill_and_confirm() callback
to block on ->lun_ref_comp to allow the first stage and
associated RCU grace period to complete, and then block on
->lun_shutdown_comp waiting for the final percpu_ref_put()
to drop the last reference via transport_lun_remove_cmd()
before continuing with core_tpg_remove_lun() shutdown.
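The ordering of the two stages can be illustrated with a small
single-threaded model. It is a sketch under loud assumptions: struct
model_lun and the model_*() helpers are hypothetical, plain booleans stand
in for the two completions (the kernel code blocks in
wait_for_completion() instead), and the confirm step runs immediately
here, whereas in the kernel it runs only after an RCU grace period.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded model; these names are not kernel API. */
struct model_lun {
	long count;		/* models lun_ref.count in atomic_t mode */
	bool dead;		/* models __PERCPU_REF_DEAD */
	bool ref_comp_done;	/* models complete(&lun->lun_ref_comp) */
	bool shutdown_done;	/* models complete(&lun->lun_shutdown_comp) */
};

static void model_lun_init(struct model_lun *lun)
{
	lun->count = 1;		/* the initial reference */
	lun->dead = false;
	lun->ref_comp_done = false;
	lun->shutdown_done = false;
}

/* New I/O lookup: refused once the ref is dead or already at zero. */
static bool model_tryget_live(struct model_lun *lun)
{
	if (lun->dead || lun->count == 0)
		return false;
	lun->count++;
	return true;
}

/* Dropping the last reference plays the role of the ->release() callback. */
static void model_put(struct model_lun *lun)
{
	if (--lun->count == 0)
		lun->shutdown_done = true;
}

/* Stage one: mark dead, run the confirm step, drop the initial reference. */
static void model_kill_and_confirm(struct model_lun *lun)
{
	lun->dead = true;
	lun->ref_comp_done = true;	/* deferred past an RCU grace period in the kernel */
	model_put(lun);
}

static int model_shutdown_demo(void)
{
	struct model_lun lun;
	bool got;

	model_lun_init(&lun);
	got = model_tryget_live(&lun);	/* one I/O is in flight */
	assert(got);

	model_kill_and_confirm(&lun);
	/* Stage one is complete, but the in-flight I/O still pins the LUN. */
	assert(lun.ref_comp_done && !lun.shutdown_done);
	/* New I/O can no longer take a reference. */
	assert(!model_tryget_live(&lun));

	model_put(&lun);		/* the in-flight I/O finishes */
	/* Stage two is complete, so the rest of shutdown may proceed. */
	assert(lun.shutdown_done);
	return 0;
}
```

In this model the second flag can only be set after the first, which is the
property the two waits in transport_clear_lun_ref() rely on.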
Reported-by: Rob Millner <rlm@daterainc.com>
Tested-by: Rob Millner <rlm@daterainc.com>
Cc: Rob Millner <rlm@daterainc.com>
Tested-by: Vaibhav Tandon <vst@datera.io>
Cc: Vaibhav Tandon <vst@datera.io>
Tested-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/target/target_core_device.c | 10 ++++++++--
drivers/target/target_core_tpg.c | 3 ++-
drivers/target/target_core_transport.c | 31 ++++++++++++++++++++++++++++++-
include/target/target_core_base.h | 1 +
4 files changed, 41 insertions(+), 4 deletions(-)
--- a/drivers/target/target_core_device.c
+++ b/drivers/target/target_core_device.c
@@ -78,12 +78,16 @@ transport_lookup_cmd_lun(struct se_cmd *
&deve->read_bytes);
se_lun = rcu_dereference(deve->se_lun);
+
+ if (!percpu_ref_tryget_live(&se_lun->lun_ref)) {
+ se_lun = NULL;
+ goto out_unlock;
+ }
+
se_cmd->se_lun = rcu_dereference(deve->se_lun);
se_cmd->pr_res_key = deve->pr_res_key;
se_cmd->orig_fe_lun = unpacked_lun;
se_cmd->se_cmd_flags |= SCF_SE_LUN_CMD;
-
- percpu_ref_get(&se_lun->lun_ref);
se_cmd->lun_ref_active = true;
if ((se_cmd->data_direction == DMA_TO_DEVICE) &&
@@ -97,6 +101,7 @@ transport_lookup_cmd_lun(struct se_cmd *
goto ref_dev;
}
}
+out_unlock:
rcu_read_unlock();
if (!se_lun) {
@@ -816,6 +821,7 @@ struct se_device *target_alloc_device(st
xcopy_lun = &dev->xcopy_lun;
rcu_assign_pointer(xcopy_lun->lun_se_dev, dev);
init_completion(&xcopy_lun->lun_ref_comp);
+ init_completion(&xcopy_lun->lun_shutdown_comp);
INIT_LIST_HEAD(&xcopy_lun->lun_deve_list);
INIT_LIST_HEAD(&xcopy_lun->lun_dev_link);
mutex_init(&xcopy_lun->lun_tg_pt_md_mutex);
--- a/drivers/target/target_core_tpg.c
+++ b/drivers/target/target_core_tpg.c
@@ -445,7 +445,7 @@ static void core_tpg_lun_ref_release(str
{
struct se_lun *lun = container_of(ref, struct se_lun, lun_ref);
- complete(&lun->lun_ref_comp);
+ complete(&lun->lun_shutdown_comp);
}
int core_tpg_register(
@@ -571,6 +571,7 @@ struct se_lun *core_tpg_alloc_lun(
lun->lun_link_magic = SE_LUN_LINK_MAGIC;
atomic_set(&lun->lun_acl_count, 0);
init_completion(&lun->lun_ref_comp);
+ init_completion(&lun->lun_shutdown_comp);
INIT_LIST_HEAD(&lun->lun_deve_list);
INIT_LIST_HEAD(&lun->lun_dev_link);
atomic_set(&lun->lun_tg_pt_secondary_offline, 0);
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -2706,10 +2706,39 @@ void target_wait_for_sess_cmds(struct se
}
EXPORT_SYMBOL(target_wait_for_sess_cmds);
+static void target_lun_confirm(struct percpu_ref *ref)
+{
+ struct se_lun *lun = container_of(ref, struct se_lun, lun_ref);
+
+ complete(&lun->lun_ref_comp);
+}
+
void transport_clear_lun_ref(struct se_lun *lun)
{
- percpu_ref_kill(&lun->lun_ref);
+ /*
+ * Mark the percpu-ref as DEAD, switch to atomic_t mode, drop
+ * the initial reference and schedule confirm kill to be
+ * executed after one full RCU grace period has completed.
+ */
+ percpu_ref_kill_and_confirm(&lun->lun_ref, target_lun_confirm);
+ /*
+ * The first completion waits for percpu_ref_switch_to_atomic_rcu()
+ * to call target_lun_confirm after lun->lun_ref has been marked
+ * as __PERCPU_REF_DEAD on all CPUs, and switches to atomic_t
+ * mode so that percpu_ref_tryget_live() lookup of lun->lun_ref
+ * fails for all new incoming I/O.
+ */
wait_for_completion(&lun->lun_ref_comp);
+ /*
+ * The second completion waits for percpu_ref_put_many() to
+ * invoke ->release() after lun->lun_ref has switched to
+ * atomic_t mode, and lun->lun_ref.count has reached zero.
+ *
+ * At this point all target-core lun->lun_ref references have
+ * been dropped via transport_lun_remove_cmd(), and it's safe
+ * to proceed with the remaining LUN shutdown.
+ */
+ wait_for_completion(&lun->lun_shutdown_comp);
}
static bool
--- a/include/target/target_core_base.h
+++ b/include/target/target_core_base.h
@@ -732,6 +732,7 @@ struct se_lun {
struct config_group lun_group;
struct se_port_stat_grps port_stat_grps;
struct completion lun_ref_comp;
+ struct completion lun_shutdown_comp;
struct percpu_ref lun_ref;
struct list_head lun_dev_link;
struct hlist_node link;