From mboxrd@z Thu Jan 1 00:00:00 1970
From: Neil Brown
Subject: Re: MD RAID1 deadlock on failed disk
Date: Wed, 27 Oct 2010 20:52:38 +1100
Message-ID: <20101027205238.4e1a4b68@notabene>
References: <0AFEJ5E11@briare1.fullpliant.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Return-path:
Received: from cantor.suse.de ([195.135.220.2]:33811 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754334Ab0J0Jwq (ORCPT ); Wed, 27 Oct 2010 05:52:46 -0400
In-Reply-To: <0AFEJ5E11@briare1.fullpliant.org>
Sender: linux-scsi-owner@vger.kernel.org
List-Id: linux-scsi@vger.kernel.org
To: Hubert Tonneau
Cc: linux-scsi@vger.kernel.org

On Wed, 27 Oct 2010 10:44:02 GMT Hubert Tonneau wrote:

> Hi,
>
> The configuration is:
> Perc H200 controller configured with no RAID (mpt2sas driver),
> 2 SATA disks (sda and sdb),
> Linux MD Software RAID1 (md0),
> stock Linux 2.6.35.7 kernel.
>
> I hotunplug the second (sdb) disk, and the result is:
> . as expected, I can read sda device,
> . as expected, any read to sdb device fails,
> . unexpectedly, any read to md0 never returns.
>
> No oops or thing like that in the kernel log.
> I did not try the same with other kernel releases.
>
> 2.6.32.24 kernel worked fine.
>
> Neil Brown asked for /proc/sysrq-trigger output,
> and concluded that the problem is related to 'fw_event0'.
> See his answer below.
>
> Regards,
> Hubert Tonneau
>
>
> Neil Brown wrote:
> >
> > The fw_event0 process is interesting.
> > It seems to be hung trying to 'sync' the drive that has just been pulled.
> > If that is somehow causing some IO request from the md/raid1 to be delayed
> > then that would certainly hang the array.
> >
> > There is a section in the middle of the trace which is missing - presumably
> > the sysrq-trigger output overflowed a buffer - that isn't uncommon.
> >
> > So I cannot see all the timing clearly.
> > How long after pulling the drive was this trace taken?
> >
> > I suspect that you need to post this to linux-scsi@vger.kernel.org
> > and ask about that fw_event0 thread - whether that should happen, whether it
> > has been fixed, and whether it could delay pending IO requests.
> >
> > NeilBrown

It probably would help to have included the sysrq-T output so the scsi people
could see why I pointed the finger at fw_event0.

Here is that part of the trace:

<6>[ 318.881486] fw_event0 D 0000000000000000 0 244 2 0x00000000
<4>[ 318.881493] ffff88081d191570 0000000000000046 ffff880800000000 00000000000158c0
<4>[ 318.881500] ffff88081d191fd8 00000000000158c0 ffff88081d191fd8 ffff88081d188000
<4>[ 318.881507] 00000000000158c0 00000000000158c0 ffff88081d191fd8 00000000000158c0
<4>[ 318.881514] Call Trace:
<4>[ 318.881520] [] schedule_timeout+0x22d/0x310
<4>[ 318.881526] [] ? __scsi_queue_insert+0xb0/0x130
<4>[ 318.881533] [] wait_for_common+0xdb/0x1a0
<4>[ 318.881540] [] ? default_wake_function+0x0/0x20
<4>[ 318.881546] [] ? __generic_unplug_device+0x33/0x40
<4>[ 318.881553] [] wait_for_completion+0x1d/0x20
<4>[ 318.881560] [] blk_execute_rq+0x8e/0xf0
<4>[ 318.881567] [] ? blk_get_request+0x6c/0xa0
<4>[ 318.881573] [] scsi_execute+0xfc/0x160
<4>[ 318.881580] [] scsi_execute_req+0xac/0x180
<4>[ 318.881589] [] sd_sync_cache+0xd0/0x120
<4>[ 318.881598] [] ? printk+0x68/0x6e
<4>[ 318.881604] [] sd_shutdown+0x83/0x1b0
<4>[ 318.881610] [] sd_remove+0x62/0xa0
<4>[ 318.881618] [] __device_release_driver+0x75/0xe0
<4>[ 318.881624] [] device_release_driver+0x2d/0x40
<4>[ 318.881631] [] bus_remove_device+0xb2/0xf0
<4>[ 318.881637] [] device_del+0x127/0x1b0
<4>[ 318.881644] [] __scsi_remove_device+0xb5/0xc0
<4>[ 318.881650] [] scsi_remove_device+0x30/0x50
<4>[ 318.881656] [] __scsi_remove_target+0xb1/0xe0
<4>[ 318.881662] [] ? __remove_child+0x0/0x30
<4>[ 318.881667] [] __remove_child+0x23/0x30
<4>[ 318.881673] [] device_for_each_child+0x4c/0x80
<4>[ 318.881679] [] scsi_remove_target+0x3e/0x70
<4>[ 318.881686] [] sas_rphy_remove+0x75/0x80
<4>[ 318.881692] [] sas_rphy_delete+0x16/0x30
<4>[ 318.881698] [] sas_port_delete+0x2a/0x130
<4>[ 318.881704] [] mpt2sas_transport_port_remove+0x15a/0x240
<4>[ 318.881711] [] _scsih_remove_device+0xcd/0x120
<4>[ 318.881720] [] ? default_spin_lock_flags+0x9/0x10
<4>[ 318.881726] [] ? mpt2sas_transport_update_links+0x80/0x1a0
<4>[ 318.881733] [] _firmware_event_work+0x155e/0x1af0
<4>[ 318.881742] [] ? __switch_to+0xcb/0x350
<4>[ 318.881749] [] ? finish_task_switch+0x4a/0xd0
<4>[ 318.881756] [] ? _firmware_event_work+0x0/0x1af0
<4>[ 318.881762] [] worker_thread+0x17f/0x2b0
<4>[ 318.881769] [] ? autoremove_wake_function+0x0/0x40
<4>[ 318.881775] [] ? worker_thread+0x0/0x2b0
<4>[ 318.881781] [] kthread+0x96/0xa0
<4>[ 318.881787] [] kernel_thread_helper+0x4/0x10
<4>[ 318.881794] [] ? kthread+0x0/0xa0
<4>[ 318.881799] [] ? kernel_thread_helper+0x0/0x10

It seems to hang here, and while it hangs old IO requests don't complete so
md/raid1 cannot proceed.

NeilBrown
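
For anyone triaging a similar dump, here is a minimal sketch of pulling the blocked (state D) tasks and their reliable frames out of sysrq-T text, which makes a chain like sd_remove -> sd_shutdown -> sd_sync_cache -> blk_execute_rq easier to spot. The names blocked_tasks, PREFIX, TASK and FRAME are hypothetical, and the regular expressions only assume the line layout seen in the trace above (printk level, timestamp, optional address in brackets, "?" marking an unreliable frame); other kernel versions may format these lines differently.

```python
import re

# Strip the printk level and timestamp, e.g. "<4>[ 318.881520] ".
PREFIX = re.compile(r"^(?:<\d+>)?\[\s*\d+\.\d+\]\s*")
# Task header columns (assumed): name, state, PC, free stack, pid, ppid.
TASK = re.compile(r"^(\S+)\s+([A-Z])\s+[0-9a-f]+\s+\d+\s+(\d+)\s+\d+")
# Stack frame: "[<addr>]" or "[]", optional "?", then symbol+offset/size.
FRAME = re.compile(
    r"^\[(?:<[0-9a-f]+>)?\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def blocked_tasks(log_text):
    """Map (task name, pid) to its reliable frames, innermost first,
    for every task reported in uninterruptible sleep (state D)."""
    tasks = {}
    current = None  # frame list of the D-state task being read, if any
    for raw in log_text.splitlines():
        line = PREFIX.sub("", raw.strip())
        header = TASK.match(line)
        if header:
            name, state, pid = header.group(1), header.group(2), int(header.group(3))
            current = tasks.setdefault((name, pid), []) if state == "D" else None
            continue
        frame = FRAME.match(line)
        # Skip frames marked "?", which the unwinder could not verify.
        if frame and current is not None and not frame.group(1):
            current.append(frame.group(2))
    return tasks


# A shortened excerpt of the trace above, used only for illustration.
SAMPLE = """\
<6>[ 318.881486] fw_event0 D 0000000000000000 0 244 2 0x00000000
<4>[ 318.881514] Call Trace:
<4>[ 318.881520] [] schedule_timeout+0x22d/0x310
<4>[ 318.881526] [] ? __scsi_queue_insert+0xb0/0x130
<4>[ 318.881533] [] wait_for_common+0xdb/0x1a0
<4>[ 318.881553] [] wait_for_completion+0x1d/0x20
<4>[ 318.881560] [] blk_execute_rq+0x8e/0xf0
<4>[ 318.881589] [] sd_sync_cache+0xd0/0x120
<4>[ 318.881604] [] sd_shutdown+0x83/0x1b0
<4>[ 318.881610] [] sd_remove+0x62/0xa0
"""

if __name__ == "__main__":
    for (name, pid), frames in blocked_tasks(SAMPLE).items():
        print(f"{name} (pid {pid}) blocked in: " + " <- ".join(frames))
```

Running it on a full dump lists every D-state task, so it should also show whether any md/raid1 thread is waiting at the same time as fw_event0.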