From mboxrd@z Thu Jan 1 00:00:00 1970
From: Bart Van Assche
Subject: Re: [PATCHv5 00/36] asynchronous ALUA device handler
Date: Wed, 30 Sep 2015 14:32:57 -0700
Message-ID: <560C5509.6050206@sandisk.com>
References: <1443523658-87622-1-git-send-email-hare@suse.de> <560AD88B.9050902@sandisk.com> <560BE1DC.9060600@suse.de>
Mime-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
Return-path:
Received: from mail-bl2on0077.outbound.protection.outlook.com ([65.55.169.77]:22808 "EHLO na01-bl2-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S932707AbbI3VdF (ORCPT ); Wed, 30 Sep 2015 17:33:05 -0400
In-Reply-To: <560BE1DC.9060600@suse.de>
Sender: linux-scsi-owner@vger.kernel.org
List-Id: linux-scsi@vger.kernel.org
To: Hannes Reinecke , James Bottomley
Cc: "linux-scsi@vger.kernel.org" , Christoph Hellwig , Ewan Milne , "Martin K. Petersen"

On 09/30/2015 06:21 AM, Hannes Reinecke wrote:
> On 09/29/2015 08:29 PM, Bart Van Assche wrote:
>> On 09/29/2015 03:47 AM, Hannes Reinecke wrote:
>>> here the next round of my update to the ALUA device handler.
>>
>> Sorry but this with this version I see an initiator kernel lockup
>> shortly after the initiator system had been booted. I have attached
>> the output of echo t > /proc/sysrq-trigger to this e-mail.
>>
> Hmm. Weird.
> Everything seems to wait for alua_rtpg() to complete:
[ ... ]

Hello Hannes,

Would it be possible to add the patch to your tree that causes
scsi_dh_alua to be loaded automatically again
(http://thread.gmane.org/gmane.linux.scsi/105276) ? I might have
forgotten to load the scsi_dh_alua driver manually before I ran my
test ...

However, even with the scsi_dh_alua driver loaded a kernel lockup is
reported. Please note that I do not know whether or not that lockup is
related to this patch series or to the changes in v4.3-rc1:

INFO: task srp_daemon:600 blocked for more than 120 seconds.
      Not tainted 4.3.0-rc1-debug+ #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
srp_daemon      D ffff88045c6a2d00     0   600    593 0x00000000
 ffff88043060b960 0000000000000092 ffffffff810ba0bd ffff88047fc95ad8
 ffff88045c6a2d00 ffff880430559680 ffff88043060c000 ffff8804181061c8
 ffff88041dbd3cf8 ffff8804181055e8 ffff880418104ad0 ffff88043060b978
Call Trace:
 [] ? trace_hardirqs_on+0xd/0x10
 [] schedule+0x3a/0x90
 [] blk_mq_freeze_queue_wait+0x56/0xb0
 [] ? prepare_to_wait_event+0xf0/0xf0
 [] blk_mq_update_tag_set_depth+0x41/0xb0
 [] blk_mq_init_allocated_queue+0x7c4/0x860
 [] blk_mq_init_queue+0x3a/0x60
 [] scsi_mq_alloc_queue+0x1c/0x50 [scsi_mod]
 [] scsi_alloc_sdev+0x331/0x3b0 [scsi_mod]
 [] scsi_probe_and_add_lun+0x884/0xd20 [scsi_mod]
 [] __scsi_scan_target+0x52b/0x5f0 [scsi_mod]
 [] ? __pm_runtime_resume+0x5c/0x80
 [] scsi_scan_target+0xdc/0x100 [scsi_mod]
 [] srp_create_target+0xfde/0x1410 [ib_srp]
 [] ? match_held_lock+0x1c1/0x200
 [] dev_attr_store+0x18/0x30
 [] sysfs_kf_write+0x44/0x60
 [] kernfs_fop_write+0x144/0x190
 [] __vfs_write+0x28/0xe0
 [] ? percpu_down_read+0x5a/0x90
 [] ? __sb_start_write+0xe0/0x100
 [] ? __sb_start_write+0xe0/0x100
 [] ? __fget+0x5/0x210
 [] vfs_write+0xa9/0x190
 [] SyS_write+0x49/0xa0
 [] entry_SYSCALL_64_fastpath+0x16/0x7a
7 locks held by srp_daemon/600:
 #0:  (&f->f_pos_lock){+.+.+.}, at: [] __fdget_pos+0x43/0x50
 #1:  (sb_writers#3){.+.+.+}, at: [] __sb_start_write+0xe0/0x100
 #2:  (&of->mutex){+.+.+.}, at: [] kernfs_fop_write+0x66/0x190
 #3:  (s_active#142){.+.+.+}, at: [] kernfs_fop_write+0x6e/0x190
 #4:  (&host->add_target_mutex){+.+.+.}, at: [] srp_create_target+0x13b/0x1410 [ib_srp]
 #5:  (&shost->scan_mutex){+.+.+.}, at: [] scsi_scan_target+0x87/0x100 [scsi_mod]
 #6:  (&set->tag_list_lock){+.+...}, at: [] blk_mq_init_allocated_queue+0x7a2/0x860

Thanks,

Bart.