From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757270Ab1JRCzW (ORCPT ); Mon, 17 Oct 2011 22:55:22 -0400 Received: from mail-iy0-f174.google.com ([209.85.210.174]:45341 "EHLO mail-iy0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752497Ab1JRCzU (ORCPT ); Mon, 17 Oct 2011 22:55:20 -0400 Date: Mon, 17 Oct 2011 19:55:16 -0700 From: Tejun Heo To: Jens Axboe Cc: stable@kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH] block: make gendisk hold a reference to its queue Message-ID: <20111018025516.GB1493@google.com> References: <20111017024329.GA23507@google.com> <4E9C1509.6030103@kernel.dk> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4E9C1509.6030103@kernel.dk> User-Agent: Mutt/1.5.20 (2009-06-14) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Oct 17, 2011 at 01:44:09PM +0200, Jens Axboe wrote: > On 2011-10-17 04:43, Tejun Heo wrote: > > The following command sequence triggers an oops. > > > > # mount /dev/sdb1 /mnt > > # echo 1 > /sys/class/scsi_device/0\:0\:1\:0/device/delete > > # umount /mnt > > > > general protection fault: 0000 [#1] PREEMPT SMP > > CPU 2 > > Modules linked in: > > > > Pid: 791, comm: umount Not tainted 3.1.0-rc3-work+ #8 Bochs Bochs > > RIP: 0010:[] [] __lock_acquire+0x389/0x1d60 > > ... > > Call Trace: > > [] lock_acquire+0x95/0x140 > > [] _raw_spin_lock+0x3b/0x50 > > [] bdi_lock_two+0x5c/0x70 > > [] bdev_inode_switch_bdi+0x4c/0xf0 > > [] __blkdev_put+0x11b/0x1d0 > > [] __blkdev_put+0x160/0x1d0 > > [] blkdev_put+0x5f/0x190 > > [] kill_block_super+0x4d/0x80 > > [] deactivate_locked_super+0x45/0x70 > > [] deactivate_super+0x4a/0x70 > > [] mntput_no_expire+0xed/0x130 > > [] sys_umount+0x7e/0x3a0 > > [] system_call_fastpath+0x16/0x1b > > > > This is because bdev holds on to disk but disk doesn't pin the > > associated queue. If a SCSI device is removed while the device is > > still open, the sdev puts the base reference to the queue on release. > > When the bdev is finally released, the associated queue is already > > gone along with the bdi and bdev_inode_switch_bdi() ends up > > dereferencing already freed bdi. > > > > Even if it were not for this bug, disk not holding onto the associated > > queue is very unusual and error-prone. > > > > Fix it by making add_disk() take an extra reference to its queue and > > put it on disk_release() and ensuring that disk and its fops owner are > > put in that order after all accesses to the disk and queue are > > complete. > > Thanks, applied. Ooh, I had the WARN_ON_ONCE() condition wrong in add_disk() as blk_get_queue() returns 1 on failure and 0 on success. Can you please flip that? Thanks. -- tejun