linux-scsi.vger.kernel.org archive mirror
* Bug when rescanning partitions on -ENOMEDIA
From: Alan Stern @ 2012-02-16 16:23 UTC
  To: Tejun Heo; +Cc: Huajun Li, SCSI development list

Tejun:

Please take a look at this thread:

	http://marc.info/?t=132214525000001&r=1&w=2

Evidently your commit 1196f8b814f32cd04df334abf47648c2a9fd8324 (block: 
rescan partitions on invalidated devices on -ENOMEDIA too) is causing 
crashes.  The problem is that the partition scanning code calls 
sd_revalidate_disk() after the underlying scsi_disk structure has been 
deallocated.

There have been numerous bug reports filed on multiple Bugzillas about 
this, for example, 

	https://bugzilla.redhat.com/show_bug.cgi?id=754518

I don't know the right way to fix this, but it looks like there needs
to be some synchronization between del_gendisk() and
rescan_partitions().

Or maybe you think the synchronization should be done in the SCSI layer 
instead.  But then what about non-SCSI users of the block layer?  Does 
the block layer guarantee that it won't call methods in disk->fops 
after del_gendisk() has returned?
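
As a rough illustration of the kind of guarantee being asked about, here
is a userspace sketch (not kernel code, and not a proposed patch).  It
assumes a per-disk lock and an "alive" flag; struct model_disk,
model_del_disk() and model_rescan() are hypothetical names that only
stand in for the gendisk, del_gendisk() and rescan_partitions().  The
point is simply that once teardown has returned, a later rescan must
not reach the driver's revalidate callback:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a disk; not the kernel's struct gendisk. */
struct model_disk {
	pthread_mutex_t lock;	/* serializes teardown against rescans */
	bool alive;		/* cleared by model_del_disk() */
	int revalidate_calls;	/* counts driver callbacks actually made */
};

void model_disk_init(struct model_disk *d)
{
	assert(d != NULL);
	pthread_mutex_init(&d->lock, NULL);
	d->alive = true;
	d->revalidate_calls = 0;
}

/* Stand-in for del_gendisk(): once this returns, no callback may run. */
void model_del_disk(struct model_disk *d)
{
	assert(d != NULL);
	pthread_mutex_lock(&d->lock);
	d->alive = false;
	pthread_mutex_unlock(&d->lock);
}

/*
 * Stand-in for rescan_partitions(): calls the (modelled) revalidate
 * callback only while the disk is still alive.  Returns 0 if the
 * callback ran, -1 if the disk was already torn down.
 */
int model_rescan(struct model_disk *d)
{
	int ret = -1;

	assert(d != NULL);
	pthread_mutex_lock(&d->lock);
	if (d->alive) {
		d->revalidate_calls++;	/* driver-private data is still valid here */
		ret = 0;
	}
	pthread_mutex_unlock(&d->lock);
	return ret;
}
```

In this model a rescan that starts after model_del_disk() has returned
sees alive == false and backs off without touching driver state.  Whether
such a check belongs in the block layer or in the SCSI layer is exactly
the open question.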

Alan Stern



* Re: Bug when rescanning partitions on -ENOMEDIA
From: Tejun Heo @ 2012-02-16 16:33 UTC
  To: Alan Stern; +Cc: Huajun Li, SCSI development list

On Thu, Feb 16, 2012 at 11:23:22AM -0500, Alan Stern wrote:
> Tejun:
> 
> Please take a look at this thread:
> 
> 	http://marc.info/?t=132214525000001&r=1&w=2
> 
> Evidently your commit 1196f8b814f32cd04df334abf47648c2a9fd8324 (block: 
> rescan partitions on invalidated devices on -ENOMEDIA too) is causing 
> crashes.  The problem is that the partition scanning code calls 
> sd_revalidate_disk() after the underlying scsi_disk structure has been 
> deallocated.
> 
> There have been numerous bug reports filed on multiple Bugzillas about 
> this, for example, 
> 
> 	https://bugzilla.redhat.com/show_bug.cgi?id=754518
> 
> I don't know the right way to fix this, but it looks like there needs
> to be some synchronization between del_gendisk() and
> rescan_partitions().
> 
> Or maybe you think the synchronization should be done in the SCSI layer 
> instead.  But then what about non-SCSI users of the block layer?  Does 
> the block layer guarantee that it won't call methods in disk->fops 
> after del_gendisk() has returned?

No, the problem is that I used rescan_partitions() in place of a
drop_partitions_and_truncate_disk(), which doesn't exist.  Please read
the following thread.  Jun'ichi Nomura has already proposed a patch.

 http://thread.gmane.org/gmane.linux.kernel/1250235/

Thanks.

-- 
tejun
