stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>,
	Johannes Thumshirn <jthumshirn@suse.de>,
	James Smart <james.smart@broadcom.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>
Subject: [PATCH 4.4 134/138] lpfc: fix oops in lpfc_sli4_scmd_to_wqidx_distr() from lpfc_send_taskmgmt()
Date: Thu, 18 Aug 2016 15:59:04 +0200	[thread overview]
Message-ID: <20160818135613.507142371@linuxfoundation.org> (raw)
In-Reply-To: <20160818135553.377018690@linuxfoundation.org>

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>

commit 05a05872c8d4b4357c9d913e6d73ae64882bddf5 upstream.

The lpfc_sli4_scmd_to_wqidx_distr() function expects the scsi_cmnd
'lpfc_cmd->pCmd' not to be null, and point to the midlayer command.

That's not true in the .eh_(device|target|bus)_reset_handler path,
because lpfc_send_taskmgmt() sends commands not from the midlayer, so
does not set 'lpfc_cmd->pCmd'.

That is true in the .queuecommand path because lpfc_queuecommand()
stores the scsi_cmnd from midlayer in lpfc_cmd->pCmd; and lpfc_cmd is
stored by lpfc_scsi_prep_cmnd() in piocbq->context1 -- which is passed
to lpfc_sli4_scmd_to_wqidx_distr() as lpfc_cmd parameter.

This problem can be hit on SCSI EH, and immediately with sg_reset.
These 2 test-cases demonstrate the problem/fix with next-20160601.

Test-case 1) sg_reset

    # strace sg_reset --device /dev/sdm
    <...>
    open("/dev/sdm", O_RDWR|O_NONBLOCK)     = 3
    ioctl(3, SG_SCSI_RESET, 0x3fffde6d0994 <unfinished ...>
    +++ killed by SIGSEGV +++
    Segmentation fault

    # dmesg
    Unable to handle kernel paging request for data at address 0x00000000
    Faulting instruction address: 0xd00000001c88442c
    Oops: Kernel access of bad area, sig: 11 [#1]
    <...>
    CPU: 104 PID: 16333 Comm: sg_reset Tainted: G        W       4.7.0-rc1-next-20160601-00004-g95b89dc #6
    <...>
    NIP [d00000001c88442c] lpfc_sli4_scmd_to_wqidx_distr+0xc/0xd0 [lpfc]
    LR [d00000001c826fe8] lpfc_sli_calc_ring.part.27+0x98/0xd0 [lpfc]
    Call Trace:
    [c000003c9ec876f0] [c000003c9ec87770] 0xc000003c9ec87770 (unreliable)
    [c000003c9ec87720] [d00000001c82e004] lpfc_sli_issue_iocb+0xd4/0x260 [lpfc]
    [c000003c9ec87780] [d00000001c831a3c] lpfc_sli_issue_iocb_wait+0x15c/0x5b0 [lpfc]
    [c000003c9ec87880] [d00000001c87f27c] lpfc_send_taskmgmt+0x24c/0x650 [lpfc]
    [c000003c9ec87950] [d00000001c87fd7c] lpfc_device_reset_handler+0x10c/0x200 [lpfc]
    [c000003c9ec87a10] [c000000000610694] scsi_try_bus_device_reset+0x44/0xc0
    [c000003c9ec87a40] [c0000000006113e8] scsi_ioctl_reset+0x198/0x2c0
    [c000003c9ec87bf0] [c00000000060fe5c] scsi_ioctl+0x13c/0x4b0
    [c000003c9ec87c80] [c0000000006629b0] sd_ioctl+0xf0/0x120
    [c000003c9ec87cd0] [c00000000046e4f8] blkdev_ioctl+0x248/0xb70
    [c000003c9ec87d30] [c0000000002a1f60] block_ioctl+0x70/0x90
    [c000003c9ec87d50] [c00000000026d334] do_vfs_ioctl+0xc4/0x890
    [c000003c9ec87de0] [c00000000026db60] SyS_ioctl+0x60/0xc0
    [c000003c9ec87e30] [c000000000009120] system_call+0x38/0x108
    Instruction dump:
    <...>

    With fix:

    # strace sg_reset --device /dev/sdm
    <...>
    open("/dev/sdm", O_RDWR|O_NONBLOCK)     = 3
    ioctl(3, SG_SCSI_RESET, 0x3fffe103c554) = 0
    close(3)                                = 0
    exit_group(0)                           = ?
    +++ exited with 0 +++

    # dmesg
    [  424.658649] lpfc 0006:01:00.4: 4:(0):0713 SCSI layer issued Device Reset (1, 0) return x2002

Test-case 2) SCSI EH

    Using this debug patch to wire an SCSI EH trigger, for lpfc_scsi_cmd_iocb_cmpl():
    -       cmd->scsi_done(cmd);
    +       if ((phba->pport ? phba->pport->cfg_log_verbose : phba->cfg_log_verbose) == 0x32100000)
    +               printk(KERN_ALERT "lpfc: skip scsi_done()\n");
    +       else
    +               cmd->scsi_done(cmd);

    # echo 0x32100000 > /sys/class/scsi_host/host11/lpfc_log_verbose

    # dd if=/dev/sdm of=/dev/null iflag=direct &
    <...>

    After a while:

    # dmesg
    lpfc 0006:01:00.4: 4:(0):3053 lpfc_log_verbose changed from 0 (x0) to 839909376 (x32100000)
    lpfc: skip scsi_done()
    <...>
    Unable to handle kernel paging request for data at address 0x00000000
    Faulting instruction address: 0xd0000000199e448c
    Oops: Kernel access of bad area, sig: 11 [#1]
    <...>
    CPU: 96 PID: 28556 Comm: scsi_eh_11 Tainted: G        W       4.7.0-rc1-next-20160601-00004-g95b89dc #6
    <...>
    NIP [d0000000199e448c] lpfc_sli4_scmd_to_wqidx_distr+0xc/0xd0 [lpfc]
    LR [d000000019986fe8] lpfc_sli_calc_ring.part.27+0x98/0xd0 [lpfc]
    Call Trace:
    [c000000ff0d0b890] [c000000ff0d0b900] 0xc000000ff0d0b900 (unreliable)
    [c000000ff0d0b8c0] [d00000001998e004] lpfc_sli_issue_iocb+0xd4/0x260 [lpfc]
    [c000000ff0d0b920] [d000000019991a3c] lpfc_sli_issue_iocb_wait+0x15c/0x5b0 [lpfc]
    [c000000ff0d0ba20] [d0000000199df27c] lpfc_send_taskmgmt+0x24c/0x650 [lpfc]
    [c000000ff0d0baf0] [d0000000199dfd7c] lpfc_device_reset_handler+0x10c/0x200 [lpfc]
    [c000000ff0d0bbb0] [c000000000610694] scsi_try_bus_device_reset+0x44/0xc0
    [c000000ff0d0bbe0] [c0000000006126cc] scsi_eh_ready_devs+0x49c/0x9c0
    [c000000ff0d0bcb0] [c000000000614160] scsi_error_handler+0x580/0x680
    [c000000ff0d0bd80] [c0000000000ae848] kthread+0x108/0x130
    [c000000ff0d0be30] [c0000000000094a8] ret_from_kernel_thread+0x5c/0xb4
    Instruction dump:
    <...>

    With fix:

    # dmesg
    lpfc 0006:01:00.4: 4:(0):3053 lpfc_log_verbose changed from 0 (x0) to 839909376 (x32100000)
    lpfc: skip scsi_done()
    <...>
    lpfc 0006:01:00.4: 4:(0):0713 SCSI layer issued Device Reset (0, 0) return x2002
    <...>
    lpfc 0006:01:00.4: 4:(0):0723 SCSI layer issued Target Reset (1, 0) return x2002
    <...>
    lpfc 0006:01:00.4: 4:(0):0714 SCSI layer issued Bus Reset Data: x2002
    <...>
    lpfc 0006:01:00.4: 4:(0):3172 SCSI layer issued Host Reset Data:
    <...>

Fixes: 8b0dff14164d ("lpfc: Add support for using block multi-queue")
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/scsi/lpfc/lpfc_scsi.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/scsi/lpfc/lpfc_scsi.c
+++ b/drivers/scsi/lpfc/lpfc_scsi.c
@@ -3859,7 +3859,7 @@ int lpfc_sli4_scmd_to_wqidx_distr(struct
 	uint32_t tag;
 	uint16_t hwq;
 
-	if (shost_use_blk_mq(cmnd->device->host)) {
+	if (cmnd && shost_use_blk_mq(cmnd->device->host)) {
 		tag = blk_mq_unique_tag(cmnd->request);
 		hwq = blk_mq_unique_tag_to_hwq(tag);
 



  parent reply	other threads:[~2016-08-18 14:18 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CGME20160818140229uscas1p28936a684c22cfb777077f1c973fad437@uscas1p2.samsung.com>
     [not found] ` <20160818135553.377018690@linuxfoundation.org>
2016-08-18 13:56   ` [PATCH 4.4 001/138] usb: gadget: avoid exposing kernel stack Greg Kroah-Hartman
2016-08-18 13:56   ` [PATCH 4.4 002/138] usb: f_fs: off by one bug in _ffs_func_bind() Greg Kroah-Hartman
2016-08-18 13:56   ` [PATCH 4.4 004/138] usb: quirks: Add no-lpm quirk for Elan Greg Kroah-Hartman
2016-08-18 13:57   ` [PATCH 4.4 010/138] arm64: debug: unmask PSTATE.D earlier Greg Kroah-Hartman
2016-08-18 13:57   ` [PATCH 4.4 012/138] tty: serial: msm: Dont read off end of tx fifo Greg Kroah-Hartman
2016-08-18 13:57   ` [PATCH 4.4 014/138] tty/serial: atmel: fix RS485 half duplex with DMA Greg Kroah-Hartman
2016-08-18 13:57   ` [PATCH 4.4 015/138] gpio: pca953x: Fix NBANK calculation for PCA9536 Greg Kroah-Hartman
2016-08-18 13:57   ` [PATCH 4.4 016/138] gpio: intel-mid: Remove potentially harmful code Greg Kroah-Hartman
2016-08-18 13:57   ` [PATCH 4.4 017/138] Bluetooth: hci_intel: Fix null gpio desc pointer dereference Greg Kroah-Hartman
2016-08-18 13:57   ` [PATCH 4.4 018/138] pinctrl: cherryview: prevent concurrent access to GPIO controllers Greg Kroah-Hartman
2016-08-18 13:57   ` [PATCH 4.4 019/138] arm64: dts: rockchip: fixes the gic400 2nd region size for rk3368 Greg Kroah-Hartman
2016-08-18 13:57   ` [PATCH 4.4 020/138] arm64: mm: avoid fdt_check_header() before the FDT is fully mapped Greg Kroah-Hartman
2016-08-18 13:57   ` [PATCH 4.4 022/138] KVM: PPC: Book3S HV: Save/restore TM state in H_CEDE Greg Kroah-Hartman
2016-08-18 13:57   ` [PATCH 4.4 023/138] KVM: MTRR: fix kvm_mtrr_check_gfn_range_consistency page fault Greg Kroah-Hartman
2016-08-18 13:57   ` [PATCH 4.4 044/138] EDAC: Correct channel count limit Greg Kroah-Hartman
2016-08-18 13:57   ` [PATCH 4.4 046/138] ovl: disallow overlayfs as upperdir Greg Kroah-Hartman
2016-08-18 13:57   ` [PATCH 4.4 047/138] remoteproc: Fix potential race condition in rproc_add Greg Kroah-Hartman
2016-08-18 13:57   ` [PATCH 4.4 048/138] ARC: mm: dont loose PTE_SPECIAL in pte_modify() Greg Kroah-Hartman
2016-08-18 13:57   ` [PATCH 4.4 049/138] jbd2: make journal y2038 safe Greg Kroah-Hartman
2016-08-18 13:57   ` [PATCH 4.4 064/138] nfsd: dont return an unhashed lock stateid after taking mutex Greg Kroah-Hartman
2016-08-18 13:57   ` [PATCH 4.4 066/138] iommu/exynos: Suppress unbinding to prevent system failure Greg Kroah-Hartman
2016-08-18 13:57   ` [PATCH 4.4 067/138] iommu/vt-d: Return error code in domain_context_mapping_one() Greg Kroah-Hartman
2016-08-18 13:57   ` [PATCH 4.4 068/138] iommu/amd: Handle IOMMU_DOMAIN_DMA in ops->domain_free call-back Greg Kroah-Hartman
2016-08-18 13:57   ` [PATCH 4.4 069/138] iommu/amd: Init unity mappings only for dma_ops domains Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 070/138] iommu/amd: Update Alias-DTE in update_device_table() Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 071/138] audit: fix a double fetch in audit_log_single_execve_arg() Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 072/138] ARM: dts: sunxi: Add a startup delay for fixed regulator enabled phys Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 074/138] w1:omap_hdq: fix regression Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 076/138] drm/amdgpu: Poll for both connect/disconnect on analog connectors Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 077/138] drm/amdgpu: support backlight control for UNIPHY3 Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 078/138] drm/amdgpu: Disable RPM helpers while reprobing connectors on resume Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 080/138] drm/amdgpu/gmc7: add missing mullins case Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 082/138] drm/radeon: Poll for both connect/disconnect on analog connectors Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 083/138] drm/radeon: fix firmware info version checks Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 086/138] drm/nouveau/gr/nv3x: fix instobj write offsets in gr setup Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 087/138] drm/nouveau/fbcon: fix font width not divisible by 8 Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 088/138] drm: Restore double clflush on the last partial cacheline Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 092/138] balloon: check the number of available pages in leak balloon Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 093/138] ftrace/recordmcount: Work around for addition of metag magic but not relocations Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 095/138] block: add missing group association in bio-cloning functions Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 096/138] block: fix bdi vs gendisk lifetime mismatch Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 097/138] mtd: nand: fix bug writing 1 byte less than page size Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 098/138] mm/hugetlb: avoid soft lockup in set_max_huge_pages() Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 099/138] ALSA: hda: Fix krealloc() with __GFP_ZERO usage Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 100/138] ALSA: hda/realtek - Cant adjust speakers volume on a Dell AIO Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 101/138] ALSA: hda: add AMD Bonaire AZ PCI ID with proper driver caps Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 102/138] ALSA: hda - Fix headset mic detection problem for two dell machines Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 103/138] IB/mlx5: Fix MODIFY_QP command input structure Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 111/138] IB/IWPM: Fix a potential skb leak Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 112/138] IB/mlx4: Fix the SQ size of an RC QP Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 113/138] IB/mlx4: Fix error flow when sending mads under SRIOV Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 114/138] IB/mlx4: Fix memory leak if QP creation failed Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 116/138] ubi: Make volume resize power cut aware Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 117/138] ubi: Fix early logging Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 124/138] target: Fix ordered task CHECK_CONDITION early exception handling Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 125/138] Input: elan_i2c - properly wake up touchpad on ASUS laptops Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 127/138] SUNRPC: Dont allocate a full sockaddr_storage for tracing Greg Kroah-Hartman
2016-08-18 13:58   ` [PATCH 4.4 129/138] MIPS: Dont register r4k sched clock when CPUFREQ enabled Greg Kroah-Hartman
2016-08-18 13:59   ` [PATCH 4.4 130/138] MIPS: hpet: Increase HPET_MIN_PROG_DELTA and decrease HPET_MIN_CYCLES Greg Kroah-Hartman
2016-08-18 13:59   ` [PATCH 4.4 133/138] ACPI / EC: Work around method reentrancy limit in ACPICA for _Qxx Greg Kroah-Hartman
2016-08-18 13:59   ` Greg Kroah-Hartman [this message]
2016-08-18 13:59   ` [PATCH 4.4 135/138] rtc: s3c: Add s3c_rtc_{enable/disable}_clk in s3c_rtc_setfreq() Greg Kroah-Hartman
2016-08-18 20:07   ` [PATCH 4.4 000/138] 4.4.19-stable review Guenter Roeck
2016-08-18 21:35   ` Shuah Khan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160818135613.507142371@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=james.smart@broadcom.com \
    --cc=jthumshirn@suse.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=mauricfo@linux.vnet.ibm.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).