linux-scsi.vger.kernel.org archive mirror
From: "Ewan D. Milne" <emilne@redhat.com>
To: Steve Wise <swise@opengridcomputing.com>
Cc: linux-scsi@vger.kernel.org, yanaijie@huawei.com, Bart.VanAssche@wdc.com
Subject: RE: crash in iscsi/scsi initiator with linux-4.15.0-rc1
Date: Tue, 19 Dec 2017 15:20:13 -0500
Message-ID: <1513714813.10760.153.camel@localhost.localdomain>
In-Reply-To: <014a01d378ff$f42c3d90$dc84b8b0$@opengridcomputing.com>

On Tue, 2017-12-19 at 13:31 -0600, Steve Wise wrote:
> > > Hey,
> > >
> > > I'm seeing this NULL pointer dereference with linux-4.15.0-rc1.  To reproduce
> > > it, I connect two ram disks via iscsi/TCP, and start an fio:
> > >
> > > iscsiadm -m discovery --op update --type sendtargets -p 172.16.1.10:3260
> > > iscsiadm -m node -p 172.16.1.10:3260 -l
> > > ISCSI_DISKS=/dev/sdd:/dev/sde; fio --rw=randrw --name=random --norandommap \
> > >   --ioengine=libaio --size=400m --group_reporting --exitall --fsync_on_close=1 \
> > >   --invalidate=1 --direct=1 --filename=$ISCSI_DISKS --time_based --runtime=300 \
> > >   --iodepth=128 --numjobs=8 --unit_base=1 --bs=64k --kb_base=1000
> > >
> > > Then on the initiator node, while the fio test is running, I detach the devices:
> > >
> > > iscsiadm -m node -p 172.16.1.10:3260 -I iser -u
> > >
> > > Then I hit this crash.  Has anyone else encountered this issue?  Wondering if
> > > there is a fix handy. :)
> > >
> > 
> > This is the same problem that is being discussed under the thread:
> > "[PATCH] scsi: fix race condition when removing target".
> > 
> > We had good test results with both Jason Yan's patch and Bart's patch
> > applied, however the ultimate solution is still in progress, see James'
> > comments.
> > 
> > You could also try reverting fbce4d97fd "scsi: fixup kernel warning
> > during rmmod()" if you just need to get past this.
> > 
> > -Ewan
> > 
> 
> Hey Ewan, Yan, Bart, 
> 
> I'm still seeing this issue with 4.15-rc4.  Is the issue still outstanding?  
> 
> Steve.
> 

Please apply the following commit from the 4.15/scsi-fixes branch of

git://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git

and advise if it does not fix your issue.  It should.

----

commit 81b6c999897919d5a16fedc018fe375dbab091c5
Author: Hannes Reinecke <hare@suse.de>
Date:   Wed Dec 13 14:21:37 2017 +0100

    scsi: core: check for device state in __scsi_remove_target()
    
    As it turned out device_get() doesn't use kref_get_unless_zero(), so we
    will be always getting a device pointer.  Consequently, we need to check
    for the device state in __scsi_remove_target() to avoid tripping over
    deleted objects.
    
    Fixes: fbce4d97fd43 ("scsi: fixup kernel warning during rmmod()")
    Reported-by: Jason Yan <yanaijie@huawei.com>
    Signed-off-by: Hannes Reinecke <hare@suse.com>
    Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
    Reviewed-by: Ewan D. Milne <emilne@redhat.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
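
The reasoning in the commit message can be illustrated with a small toy model. This is only a sketch in Python, not the kernel code: `Device`, `State`, `get_plain()`, `get_unless_zero()` and `removable()` are hypothetical names invented for illustration. It assumes a "get" that always hands back the object, as the commit message describes, and shows why the caller then has to look at the device state itself before taking the reference.

```python
from dataclasses import dataclass
from enum import Enum, auto


class State(Enum):
    RUNNING = auto()
    DELETED = auto()


@dataclass
class Device:
    # Hypothetical stand-in for a refcounted device object.
    name: str
    refcount: int = 1
    state: State = State.RUNNING


def get_plain(dev):
    """Unconditional get: bumps the count even if it already hit zero."""
    dev.refcount += 1
    return dev


def get_unless_zero(dev):
    """Get that refuses an object whose count already dropped to zero."""
    if dev.refcount == 0:
        return None
    dev.refcount += 1
    return dev


def removable(devices):
    """Pick devices that are safe to remove.

    Deleted devices are skipped *before* a reference is taken,
    because get_plain() has no way to reject them.
    """
    picked = []
    for dev in devices:
        if dev.state is State.DELETED:
            continue
        picked.append(get_plain(dev))
    return picked
```

In this model, `get_unless_zero()` would reject an already-released device on its own, while `get_plain()` returns it regardless, so the explicit state check in `removable()` is what keeps the loop away from deleted objects.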

> ---
> 
> [ 1002.205103] BUG: unable to handle kernel NULL pointer dereference at           (null)
> [ 1002.213022] IP: _raw_spin_lock_irqsave+0x1e/0x40
> [ 1002.217740] PGD 0 P4D 0
> [ 1002.220382] Oops: 0002 [#1] SMP
> [ 1002.223637] Modules linked in: iw_cxgb4 cxgb4 nvme_rdma nvme_fabrics rdma_ktest(O) rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx4_ib ib_core libcxgb vfat intel_rapl fat iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd glue_helper cryptd iTCO_wdt iTCO_vendor_support mxm_wmi mei_me ipmi_si lpc_ich mei pcspkr i2c_i801 mfd_core ipmi_devintf shpchp sg ipmi_msghandler wmi nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 mlx4_en mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm sd_mod igb drm ahci libahci dca mlx4_core
> [ 1002.295663]  ptp libata pps_core crc32c_intel nvme i2c_algo_bit i2c_core nvme_core [last unloaded: cxgb4]
> [ 1002.305563] CPU: 4 PID: 5156 Comm: fio Tainted: G           O     4.15.0-rc4 #3
> [ 1002.313223] Hardware name: Supermicro X9DR3-F/X9DR3-F, BIOS 3.2a 07/09/2015
> [ 1002.320555] RIP: 0010:_raw_spin_lock_irqsave+0x1e/0x40
> [ 1002.326077] RSP: 0018:ffffc900070cbd10 EFLAGS: 00010046
> [ 1002.331692] RAX: 0000000000000000 RBX: 0000000000000246 RCX: 0000000000000000
> [ 1002.339225] RDX: 0000000000000001 RSI: ffff88085fd0e038 RDI: 0000000000000000
> [ 1002.346763] RBP: ffff880855a65f18 R08: 0000000000000000 R09: 0000000000000744
> [ 1002.354315] R10: 00000000000003ff R11: 0000000000000001 R12: ffff88084992e180
> [ 1002.361873] R13: ffff880855a67000 R14: ffff880855a65800 R15: ffff880856d7d5a8
> [ 1002.369447] FS:  0000000000000000(0000) GS:ffff88085fd00000(0000) knlGS:0000000000000000
> [ 1002.377995] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1002.384209] CR2: 0000000000000000 CR3: 0000000001c09005 CR4: 00000000000606e0
> [ 1002.391826] Call Trace:
> [ 1002.394774]  scsi_device_dev_release_usercontext+0x40/0x230
> [ 1002.400858]  execute_in_process_context+0x58/0x60
> [ 1002.406085]  device_release+0x2d/0x80
> [ 1002.410277]  kobject_cleanup+0x5e/0x180
> [ 1002.414659]  scsi_disk_put+0x2b/0x40 [sd_mod]
> [ 1002.419559]  __blkdev_put+0x1b5/0x1d0
> [ 1002.423777]  ? disk_flush_events+0x24/0x60
> [ 1002.428430]  blkdev_close+0x21/0x30
> [ 1002.432484]  __fput+0xd5/0x210
> [ 1002.436111]  task_work_run+0x82/0xa0
> [ 1002.440262]  do_exit+0x2be/0xb20
> [ 1002.444074]  ? syscall_trace_enter+0x1af/0x290
> [ 1002.449110]  do_group_exit+0x39/0xa0
> [ 1002.453287]  SyS_exit_group+0x10/0x10
> [ 1002.457557]  do_syscall_64+0x61/0x1a0
> [ 1002.461829]  entry_SYSCALL64_slow_path+0x25/0x25
> [ 1002.467064] RIP: 0033:0x7f9abb1c8529
> [ 1002.471266] RSP: 002b:00007ffe53be40d8 EFLAGS: 00000206 ORIG_RAX: 00000000000000e7
> [ 1002.479482] RAX: ffffffffffffffda RBX: 0000000000000010 RCX: 00007f9abb1c8529
> [ 1002.487279] RDX: 0000000000000005 RSI: 000000000000000a RDI: 0000000000000005
> [ 1002.495079] RBP: 00007f9a9c9de818 R08: 000000000000003c R09: 00000000000000e7
> [ 1002.502882] R10: ffffffffffffff60 R11: 0000000000000206 R12: 0000000000000006
> [ 1002.510690] R13: 0000000000000006 R14: 0000000000000000 R15: 000000000172a440
> [ 1002.518497] Code: f4 66 90 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 53 9c 58 66 66 90 66 90 48 89 c3 fa 66 66 90 66 66 90 31 c0 ba 01 00 00 00 <f0> 0f b1 17 85 c0 75 05 48 89 d8 5b c3 89 c6 e8 77 06 9e ff eb
> [ 1002.538742] RIP: _raw_spin_lock_irqsave+0x1e/0x40 RSP: ffffc900070cbd10
> [ 1002.546055] CR2: 0000000000000000
> 

Thread overview: 6+ messages
2017-12-01 17:00 crash in iscsi/scsi initiator with linux-4.15.0-rc1 Steve Wise
2017-12-01 18:32 ` Ewan D. Milne
2017-12-01 20:36   ` Steve Wise
2017-12-19 19:31   ` Steve Wise
2017-12-19 20:20     ` Ewan D. Milne [this message]
2017-12-20 19:05       ` Steve Wise
