From: Luis Henriques <luis.henriques@canonical.com>
To: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: Tejun Heo <tj@kernel.org>,
Naveen Goswamy <naveen.goswamy@polymtl.ca>,
Jens Axboe <axboe@kernel.dk>,
James Bottomley <James.Bottomley@HansenPartnership.com>,
Stefan Richter <stefanr@s5r6.in-berlin.de>,
Dave Jones <davej@redhat.com>,
linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org
Subject: Re: Kernel crashing on eject SD card
Date: Thu, 1 Mar 2012 18:58:17 +0000
Message-ID: <20120301185817.GA16253@zeus>
In-Reply-To: <4F3C5B4E.3080606@ce.jp.nec.com>

Hi,

On Thu, Feb 16, 2012 at 10:26:38AM +0900, Jun'ichi Nomura wrote:
> Hi,
>
> Thank you for the review and comments.
>
> On 02/16/12 02:26, Tejun Heo wrote:
> > On Wed, Feb 15, 2012 at 11:56:19AM +0900, Jun'ichi Nomura wrote:
> >> +int invalidate_partitions(struct gendisk *disk, struct block_device *bdev)
> >> +{
> >> +	int res;
> >> +
> >> +	res = drop_partitions(disk, bdev);
> >> +	if (res)
> >> +		return res;
> >> +
> >
> > Hmmm... shouldn't we have set_capacity(disk, 0) here?
>
> Added.
> I wasn't sure whether I should leave it to drivers,
> but capacity 0 seems reasonable for an ENOMEDIUM device.
>
> >> +	check_disk_size_change(disk, bdev);
> >> +	bdev->bd_invalidated = 0;
> >> +	/* tell userspace that the media / partition table may have changed */
> >> +	kobject_uevent(&disk_to_dev(disk)->kobj, KOBJ_CHANGE);
> >
> > Also, we really shouldn't be generating KOBJ_CHANGE after every
> > -ENOMEDIUM open. This can easily lead to an infinite loop. We should
> > generate this iff we actually dropped partitions && modified the size.
>
> invalidate_partitions() is called only when bd_invalidated is set.
> So KOBJ_CHANGE is not raised for every ENOMEDIUM open.
>
> I made the check explicit in the function to make it safer against
> possible misuse.
>
> How about this?

Are there any updates on this fix? Has there been any progress, and does
this patch have a chance of reaching mainline soon?

I ran a quick test and the patch seems to solve the problem (or, at least, I
can no longer reproduce the oops).
Cheers,
--
Luis
> ---------------------------------------------------------
> Do not call drivers when invalidating partitions for -ENOMEDIUM
>
> When a scsi driver returns -ENOMEDIUM for open(),
> __blkdev_get() calls rescan_partitions(), which ends up calling
> sd_revalidate_disk() without getting a refcount of scsi_device.
>
> That could lead to oops like this:
>
> process A                  process B
> ----------------------------------------------
> sys_open
>   __blkdev_get
>     sd_open
>       returns -ENOMEDIUM
>                              scsi_remove_device
>                                <scsi_device torn down>
>     rescan_partitions
>       sd_revalidate_disk
>         <oops>
>
> Oopses are reported here:
> http://marc.info/?l=linux-scsi&m=132388619710052
>
> This patch separates the partition invalidation from rescan_partitions()
> and uses it for the -ENOMEDIUM case.
>
> Index: linux-3.3/block/partition-generic.c
> ===================================================================
> --- linux-3.3.orig/block/partition-generic.c 2012-02-15 09:00:25.147293790 +0900
> +++ linux-3.3/block/partition-generic.c 2012-02-16 10:48:22.257680685 +0900
> @@ -389,17 +389,11 @@ static bool disk_unlock_native_capacity(
>  	}
>  }
>
> -int rescan_partitions(struct gendisk *disk, struct block_device *bdev)
> +static int drop_partitions(struct gendisk *disk, struct block_device *bdev)
>  {
> -	struct parsed_partitions *state = NULL;
>  	struct disk_part_iter piter;
>  	struct hd_struct *part;
> -	int p, highest, res;
> -rescan:
> -	if (state && !IS_ERR(state)) {
> -		kfree(state);
> -		state = NULL;
> -	}
> +	int res;
>
>  	if (bdev->bd_part_count)
>  		return -EBUSY;
> @@ -412,6 +406,24 @@ rescan:
>  		delete_partition(disk, part->partno);
>  	disk_part_iter_exit(&piter);
>
> +	return 0;
> +}
> +
> +int rescan_partitions(struct gendisk *disk, struct block_device *bdev)
> +{
> +	struct parsed_partitions *state = NULL;
> +	struct hd_struct *part;
> +	int p, highest, res;
> +rescan:
> +	if (state && !IS_ERR(state)) {
> +		kfree(state);
> +		state = NULL;
> +	}
> +
> +	res = drop_partitions(disk, bdev);
> +	if (res)
> +		return res;
> +
>  	if (disk->fops->revalidate_disk)
>  		disk->fops->revalidate_disk(disk);
>  	check_disk_size_change(disk, bdev);
> @@ -515,6 +527,26 @@ rescan:
>  	return 0;
>  }
>
> +int invalidate_partitions(struct gendisk *disk, struct block_device *bdev)
> +{
> +	int res;
> +
> +	if (!bdev->bd_invalidated)
> +		return 0;
> +
> +	res = drop_partitions(disk, bdev);
> +	if (res)
> +		return res;
> +
> +	set_capacity(disk, 0);
> +	check_disk_size_change(disk, bdev);
> +	bdev->bd_invalidated = 0;
> +	/* tell userspace that the media / partition table may have changed */
> +	kobject_uevent(&disk_to_dev(disk)->kobj, KOBJ_CHANGE);
> +
> +	return 0;
> +}
> +
>  unsigned char *read_dev_sector(struct block_device *bdev, sector_t n, Sector *p)
>  {
>  	struct address_space *mapping = bdev->bd_inode->i_mapping;
> Index: linux-3.3/include/linux/genhd.h
> ===================================================================
> --- linux-3.3.orig/include/linux/genhd.h 2012-02-09 12:21:53.000000000 +0900
> +++ linux-3.3/include/linux/genhd.h 2012-02-16 10:47:43.783681813 +0900
> @@ -596,6 +596,7 @@ extern char *disk_name (struct gendisk *
>
> extern int disk_expand_part_tbl(struct gendisk *disk, int target);
> extern int rescan_partitions(struct gendisk *disk, struct block_device *bdev);
> +extern int invalidate_partitions(struct gendisk *disk, struct block_device *bdev);
> extern struct hd_struct * __must_check add_partition(struct gendisk *disk,
> int partno, sector_t start,
> sector_t len, int flags,
> Index: linux-3.3/fs/block_dev.c
> ===================================================================
> --- linux-3.3.orig/fs/block_dev.c 2012-02-09 12:21:53.000000000 +0900
> +++ linux-3.3/fs/block_dev.c 2012-02-16 10:47:52.602681441 +0900
> @@ -1183,8 +1183,12 @@ static int __blkdev_get(struct block_dev
>  			 * The latter is necessary to prevent ghost
>  			 * partitions on a removed medium.
>  			 */
> -			if (bdev->bd_invalidated && (!ret || ret == -ENOMEDIUM))
> -				rescan_partitions(disk, bdev);
> +			if (bdev->bd_invalidated) {
> +				if (!ret)
> +					rescan_partitions(disk, bdev);
> +				else if (ret == -ENOMEDIUM)
> +					invalidate_partitions(disk, bdev);
> +			}
>  			if (ret)
>  				goto out_clear;
>  		} else {
> @@ -1214,8 +1218,12 @@ static int __blkdev_get(struct block_dev
>  			if (bdev->bd_disk->fops->open)
>  				ret = bdev->bd_disk->fops->open(bdev, mode);
>  			/* the same as first opener case, read comment there */
> -			if (bdev->bd_invalidated && (!ret || ret == -ENOMEDIUM))
> -				rescan_partitions(bdev->bd_disk, bdev);
> +			if (bdev->bd_invalidated) {
> +				if (!ret)
> +					rescan_partitions(bdev->bd_disk, bdev);
> +				else if (ret == -ENOMEDIUM)
> +					invalidate_partitions(bdev->bd_disk, bdev);
> +			}
>  			if (ret)
>  				goto out_unlock_bdev;
>  		}