From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755061AbYFWWJm (ORCPT ); Mon, 23 Jun 2008 18:09:42 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1753407AbYFWWJb (ORCPT ); Mon, 23 Jun 2008 18:09:31 -0400
Received: from ogre.sisk.pl ([217.79.144.158]:43096 "EHLO ogre.sisk.pl"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752838AbYFWWJa (ORCPT ); Mon, 23 Jun 2008 18:09:30 -0400
From: "Rafael J. Wysocki"
To: LKML
Subject: 2.6.26-rc7-git1: possible circular locking dependency with RAID1
Date: Tue, 24 Jun 2008 00:10:35 +0200
User-Agent: KMail/1.9.6 (enterprise 20070904.708012)
Cc: Andrew Morton, Neil Brown
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-2"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200806240010.35866.rjw@sisk.pl>
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

Today I was rebuilding software RAID arrays on one of my boxes and I did

# mdadm /dev/md4 --fail /dev/sdb7 --remove /dev/sdb7

and then I thought it was a bad idea and did

# mdadm /dev/md4 --add /dev/sdb7

That resulted in the following lockdep report, although the operation seemed to
have completed successfully otherwise:

raid1: Disk failure on sdb7, disabling device.
raid1: Operation continuing on 1 devices.
RAID1 conf printout:
 --- wd:1 rd:2
 disk 0, wo:1, o:0, dev:sdb7
 disk 1, wo:0, o:1, dev:sda7
RAID1 conf printout:
 --- wd:1 rd:2
 disk 1, wo:0, o:1, dev:sda7
md: unbind
md: export_rdev(sdb7)

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.26-rc7 #196
-------------------------------------------------------
mdadm/5973 is trying to acquire lock:
 (&type->s_umount_key#17){----}, at: [] get_super+0x69/0xc0

but task is already holding lock:
 (&bdev->bd_mutex){--..}, at: [] do_open+0x72/0x320

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&bdev->bd_mutex){--..}:
       [] __lock_acquire+0xc44/0x10d0
       [] lock_acquire+0x57/0x80
       [] mutex_lock_nested+0xa3/0x290
       [] do_open+0x72/0x320
       [] __blkdev_get+0x8b/0xb0
       [] blkdev_get+0xb/0x10
       [] open_by_devnum+0x3c/0x60
       [] journal_init+0x225/0xa20 [reiserfs]
       [] reiserfs_fill_super+0x2db/0xa20 [reiserfs]
       [] get_sb_bdev+0x134/0x170
       [] get_super_block+0x13/0x20 [reiserfs]
       [] vfs_kern_mount+0x79/0x160
       [] do_kern_mount+0x4e/0x100
       [] do_new_mount+0x89/0xb0
       [] do_mount+0x1e0/0x240
       [] sys_mount+0x94/0xe0
       [] system_call_after_swapgs+0x7b/0x80
       [] 0xffffffffffffffff

-> #0 (&type->s_umount_key#17){----}:
       [] __lock_acquire+0xa8e/0x10d0
       [] lock_acquire+0x57/0x80
       [] down_read+0x3e/0x50
       [] get_super+0x69/0xc0
       [] __invalidate_device+0x1f/0x60
       [] check_disk_change+0x48/0x90
       [] md_open+0x72/0x90
       [] do_open+0x26f/0x320
       [] blkdev_open+0x3e/0x80
       [] __dentry_open+0xdb/0x2d0
       [] nameidata_to_filp+0x44/0x60
       [] do_filp_open+0x1e4/0x9e0
       [] do_sys_open+0x5c/0xf0
       [] sys_open+0x1b/0x20
       [] system_call_after_swapgs+0x7b/0x80
       [] 0xffffffffffffffff

other info that might help us debug this:

1 lock held by mdadm/5973:
 #0:  (&bdev->bd_mutex){--..}, at: [] do_open+0x72/0x320

stack backtrace:
Pid: 5973, comm: mdadm Not tainted 2.6.26-rc7 #196

Call Trace:
 [] print_circular_bug_tail+0x83/0x90
 [] __lock_acquire+0xa8e/0x10d0
 [] lock_acquire+0x57/0x80
 [] ? get_super+0x69/0xc0
 [] down_read+0x3e/0x50
 [] get_super+0x69/0xc0
 [] __invalidate_device+0x1f/0x60
 [] check_disk_change+0x48/0x90
 [] md_open+0x72/0x90
 [] do_open+0x26f/0x320
 [] ? _spin_unlock+0x26/0x30
 [] blkdev_open+0x3e/0x80
 [] __dentry_open+0xdb/0x2d0
 [] ? blkdev_open+0x0/0x80
 [] nameidata_to_filp+0x44/0x60
 [] do_filp_open+0x1e4/0x9e0
 [] ? check_poison_obj+0x31/0x210
 [] ? get_unused_fd_flags+0x105/0x130
 [] do_sys_open+0x5c/0xf0
 [] sys_open+0x1b/0x20
 [] system_call_after_swapgs+0x7b/0x80

VFS: busy inodes on changed media.
md: bind
RAID1 conf printout:
 --- wd:1 rd:2
 disk 0, wo:1, o:1, dev:sdb7
 disk 1, wo:0, o:1, dev:sda7
md: recovery of RAID array md4
md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
md: using 128k window, over a total of 33551616 blocks.

Thanks,
Rafael