* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache [not found] ` <15ddcffd0605112153q57f139a1k7068e204a3eeaf1f@mail.gmail.com> @ 2006-05-12 17:16 ` Erik Mouw 2006-05-12 17:27 ` Andrew Vasquez 2006-05-12 17:36 ` Linus Torvalds 0 siblings, 2 replies; 44+ messages in thread From: Erik Mouw @ 2006-05-12 17:16 UTC (permalink / raw) To: Or Gerlitz; +Cc: linux-scsi, rmk, axboe, linux-kernel, torvalds On Fri, May 12, 2006 at 06:53:27AM +0200, Or Gerlitz wrote: > On 5/11/06, Erik Mouw <erik@harddisk-recovery.com> wrote: > >Hi, > > > >While playing with libata in 2.6.17-git from today, I got this bug: > > > >SCSI subsystem initialized > >libata version 1.20 loaded. > >sata_promise 0000:02:05.0: version 1.04 > >kmem_cache_create: duplicate cache scsi_cmd_cache > > <c0159591> kmem_cache_create+0x331/0x390 > > <e0924371> scsi_setup_command_freelist+0x71/0xf0 [scsi_mod] > > <e092588e> scsi_host_alloc+0x17e/0x270 [scsi_mod] > > <e08fd061> ata_host_add+0x41/0xc0 [libata] > > <c0148c9f> __kzalloc+0x1f/0x50 > > <e08fd190> ata_device_add+0xb0/0x240 [libata] > > <e0839baf> pdc_ata_init_one+0x27f/0x330 [sata_promise] > > <c01dd1a9> pci_call_probe+0x19/0x20 > > <c01dd20e> __pci_device_probe+0x5e/0x70 > > <c01dd24f> pci_device_probe+0x2f/0x50 > > <c0220977> driver_probe_device+0xb7/0xe0 > > <c02b338a> klist_dec_and_del+0x1a/0x20 > > <c0220a30> __driver_attach+0x0/0x90 > > <c0220aa1>__driver_attach+0x71/0x90 > > <c021fd49> bus_for_each_dev+0x69/0x80 > > <c0220ae6> driver_attach+0x26/0x30 > > <c0220a30> __driver_attach+0x0/0x90 > > <c02202a3> bus_add_driver+0x83/0xc0 > > <c01dd4ed> __pci_register_driver+0x4d/0x70 > > <e0880017> pdc_ata_init+0x17/0x1b [sata_promise] > > <c013a1a0> sys_init_module+0x120/0x1b0 > > <c0102f27> syscall_call+0x7/0xb > > I was pointing on 2.6.17 kmem_cache related issues while back to LKML > and it turns out that you can reproduce them easily with standalone > trivial module, see > http://openib.org/pipermail/openib-general/2006-April/020582.html > > So far no one picked the issue anb i was adviced to use bisection... I tracked it down with git bisect. The culprit is this commit: 56cf6504fc1c0c221b82cebc16a444b684140fb7 is first bad commit diff-tree 56cf6504fc1c0c221b82cebc16a444b684140fb7 (from d98550e334715b2d9e45f8f0f4e1608720108640) Author: Russell King <rmk@dyn-67.arm.linux.org.uk> Date: Fri May 5 17:57:52 2006 +0100 [BLOCK] Fix oops on removal of SD/MMC card The block layer keeps a reference (driverfs_dev) to the struct device associated with the block device, and uses it internally for generating uevents in block_uevent. Block device uevents include umounting the partition, which can occur after the backing device has been removed. Unfortunately, this reference is not counted. This means that if the struct device is removed from the device tree, the block layers reference will become stale. Guard against this by holding a reference to the struct device in add_disk(), and only drop the reference when we're releasing the gendisk kobject - in other words when we can be sure that no further uevents will be generated for this block device. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Acked-by: Jens Axboe <axboe@suse.de> :040000 040000 4923c988a57db93382546cd83f4e3043b06c0eed 08e396a4fbf3c7d0beb9178f0fd0c9205c0b5305 M block After reverting this commit in 2.6.17-rc4 I can't trigger the bug anymore. Might be worth fixing before 2.6.17-final. Note: I'm going on holiday next week, so I'm not able to test any fixes. However, because this bug is very easy to trigger[1], anybody with root on NFS and fully modular SCSI or libata should be able to test. Erik [1] See http://marc.theaimsgroup.com/?l=linux-scsi&m=114736053211943&w=2 -- +-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 -- | Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 17:16 ` [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache Erik Mouw @ 2006-05-12 17:27 ` Andrew Vasquez 2006-05-12 20:18 ` Erik Mouw 2006-05-12 17:36 ` Linus Torvalds 1 sibling, 1 reply; 44+ messages in thread From: Andrew Vasquez @ 2006-05-12 17:27 UTC (permalink / raw) To: Erik Mouw; +Cc: Or Gerlitz, linux-scsi, rmk, axboe, linux-kernel, torvalds On Fri, 12 May 2006, Erik Mouw wrote: > On Fri, May 12, 2006 at 06:53:27AM +0200, Or Gerlitz wrote: > > On 5/11/06, Erik Mouw <erik@harddisk-recovery.com> wrote: > > >Hi, > > > > > >While playing with libata in 2.6.17-git from today, I got this bug: > > > > > >SCSI subsystem initialized > > >libata version 1.20 loaded. > > >sata_promise 0000:02:05.0: version 1.04 > > >kmem_cache_create: duplicate cache scsi_cmd_cache > > > <c0159591> kmem_cache_create+0x331/0x390 > > > <e0924371> scsi_setup_command_freelist+0x71/0xf0 [scsi_mod] > > > <e092588e> scsi_host_alloc+0x17e/0x270 [scsi_mod] > > > <e08fd061> ata_host_add+0x41/0xc0 [libata] > > > <c0148c9f> __kzalloc+0x1f/0x50 > > > <e08fd190> ata_device_add+0xb0/0x240 [libata] > > > <e0839baf> pdc_ata_init_one+0x27f/0x330 [sata_promise] > > > <c01dd1a9> pci_call_probe+0x19/0x20 > > > <c01dd20e> __pci_device_probe+0x5e/0x70 > > > <c01dd24f> pci_device_probe+0x2f/0x50 > > > <c0220977> driver_probe_device+0xb7/0xe0 > > > <c02b338a> klist_dec_and_del+0x1a/0x20 > > > <c0220a30> __driver_attach+0x0/0x90 > > > <c0220aa1>__driver_attach+0x71/0x90 > > > <c021fd49> bus_for_each_dev+0x69/0x80 > > > <c0220ae6> driver_attach+0x26/0x30 > > > <c0220a30> __driver_attach+0x0/0x90 > > > <c02202a3> bus_add_driver+0x83/0xc0 > > > <c01dd4ed> __pci_register_driver+0x4d/0x70 > > > <e0880017> pdc_ata_init+0x17/0x1b [sata_promise] > > > <c013a1a0> sys_init_module+0x120/0x1b0 > > > <c0102f27> syscall_call+0x7/0xb > > > > I was pointing on 2.6.17 kmem_cache related issues while back to LKML > > and it turns out that you can reproduce them easily with standalone > > trivial module, see > > http://openib.org/pipermail/openib-general/2006-April/020582.html > > > > So far no one picked the issue anb i was adviced to use bisection... > > I tracked it down with git bisect. The culprit is this commit: > > 56cf6504fc1c0c221b82cebc16a444b684140fb7 is first bad commit > diff-tree 56cf6504fc1c0c221b82cebc16a444b684140fb7 (from > d98550e334715b2d9e45f8f0f4e1608720108640) > Author: Russell King <rmk@dyn-67.arm.linux.org.uk> > Date: Fri May 5 17:57:52 2006 +0100 > > [BLOCK] Fix oops on removal of SD/MMC card > > The block layer keeps a reference (driverfs_dev) to the struct > device associated with the block device, and uses it internally > for generating uevents in block_uevent. > > Block device uevents include umounting the partition, which can > occur after the backing device has been removed. ... THat's really weird... I reported a completely unrelated problem some days ago: http://marc.theaimsgroup.com/?l=linux-kernel&m=114728598328769&w=2 > After reverting this commit in 2.6.17-rc4 I can't trigger the bug > anymore. Might be worth fixing before 2.6.17-final. and similarly reverted the commit in question... I'm still trying to track this down... > Note: I'm going on holiday next week, so I'm not able to test any > fixes. However, because this bug is very easy to trigger[1], anybody > with root on NFS and fully modular SCSI or libata should be able to > test. -- Andrew Vasquez ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 17:27 ` Andrew Vasquez @ 2006-05-12 20:18 ` Erik Mouw 0 siblings, 0 replies; 44+ messages in thread From: Erik Mouw @ 2006-05-12 20:18 UTC (permalink / raw) To: Andrew Vasquez; +Cc: Or Gerlitz, linux-scsi, rmk, axboe, linux-kernel, torvalds On Fri, May 12, 2006 at 10:27:29AM -0700, Andrew Vasquez wrote: > On Fri, 12 May 2006, Erik Mouw wrote: > > I tracked it down with git bisect. The culprit is this commit: > > > > 56cf6504fc1c0c221b82cebc16a444b684140fb7 is first bad commit > > diff-tree 56cf6504fc1c0c221b82cebc16a444b684140fb7 (from > > d98550e334715b2d9e45f8f0f4e1608720108640) > > Author: Russell King <rmk@dyn-67.arm.linux.org.uk> > > Date: Fri May 5 17:57:52 2006 +0100 > > > > [BLOCK] Fix oops on removal of SD/MMC card > > > > The block layer keeps a reference (driverfs_dev) to the struct > > device associated with the block device, and uses it internally > > for generating uevents in block_uevent. > > > > Block device uevents include umounting the partition, which can > > occur after the backing device has been removed. > ... > > THat's really weird... I reported a completely unrelated problem some > days ago: > > http://marc.theaimsgroup.com/?l=linux-kernel&m=114728598328769&w=2 AFAICS it's the same bug. Apparently you can trigger it with any scsi host driver. Erik -- +-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 -- | Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 17:16 ` [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache Erik Mouw 2006-05-12 17:27 ` Andrew Vasquez @ 2006-05-12 17:36 ` Linus Torvalds 2006-05-12 17:47 ` James Bottomley ` (2 more replies) 1 sibling, 3 replies; 44+ messages in thread From: Linus Torvalds @ 2006-05-12 17:36 UTC (permalink / raw) To: Erik Mouw; +Cc: Or Gerlitz, linux-scsi, rmk, axboe, Linux Kernel Mailing List On Fri, 12 May 2006, Erik Mouw wrote: > On Fri, May 12, 2006 at 06:53:27AM +0200, Or Gerlitz wrote: > > > > I was pointing on 2.6.17 kmem_cache related issues while back to LKML > > and it turns out that you can reproduce them easily with standalone > > trivial module, see > > http://openib.org/pipermail/openib-general/2006-April/020582.html > > > > So far no one picked the issue anb i was adviced to use bisection... > > I tracked it down with git bisect. The culprit is this commit: > > 56cf6504fc1c0c221b82cebc16a444b684140fb7 is first bad commit Ok, that's just _strange_. Clearly bisection picked the right commit, since: > After reverting this commit in 2.6.17-rc4 I can't trigger the bug > anymore. but at the same time, I don't see what that two-liner patch to block/genhd.c has to do with that trivial module which doesn't even _use_ any of the functions that the patch affects. It sounds like you used some other test-case than the trivial module above to test bisection? So when you say "I tracked _it_ down with git bisect" (my emphasis), it sounds like "it" was really something else than the trivial stand-alone module. Or are you saying that the trivial stand-alone module _also_ got fixed? > Might be worth fixing before 2.6.17-final. Yes. We could just revert that commit, but it seems correct, and I'd really like for somebody to understand _why_ that commit matters at all. I certainly don't see the overlap here.. So I get the feeling that you bisected just the case below, right? > Note: I'm going on holiday next week, so I'm not able to test any > fixes. However, because this bug is very easy to trigger[1], anybody > with root on NFS and fully modular SCSI or libata should be able to > test. > > [1] See http://marc.theaimsgroup.com/?l=linux-scsi&m=114736053211943&w=2 Hmm. Jens? Any ideas? Linus ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 17:36 ` Linus Torvalds @ 2006-05-12 17:47 ` James Bottomley 2006-05-12 18:58 ` James Bottomley 2006-05-12 20:21 ` Erik Mouw 2006-05-12 20:34 ` Russell King 2 siblings, 1 reply; 44+ messages in thread From: James Bottomley @ 2006-05-12 17:47 UTC (permalink / raw) To: Linus Torvalds Cc: Erik Mouw, Or Gerlitz, linux-scsi, rmk, axboe, Linux Kernel Mailing List On Fri, 2006-05-12 at 10:36 -0700, Linus Torvalds wrote: > Ok, that's just _strange_. Clearly bisection picked the right commit, > since: I can guess what the problem is. The kmem_cache_release is triggered from the device model release of the host generic device. I suspect we've induced a tangle in here that means scsi devices (and hence hosts) get deleted but never released. I'll look at the release paths and see if I can work out what it is. James ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 17:47 ` James Bottomley @ 2006-05-12 18:58 ` James Bottomley 2006-05-12 19:09 ` Linus Torvalds ` (2 more replies) 0 siblings, 3 replies; 44+ messages in thread From: James Bottomley @ 2006-05-12 18:58 UTC (permalink / raw) To: Linus Torvalds Cc: Erik Mouw, Or Gerlitz, linux-scsi, rmk, axboe, Linux Kernel Mailing List On Fri, 2006-05-12 at 12:47 -0500, James Bottomley wrote: > I'll look at the release paths and see if I can work out what it is. OK, here's the scoop. The problem patch adds a get of driverfs_dev in add_disk(), but doesn't put it again until disk_release() (which occurs on final put_disk() of the gendisk). However, in SCSI, the driverfs_dev is the sdev_gendev. That means there's a reference held on sdev_gendev until final disk put. Unfortunately, we use the driver model driver_remove to trigger del_gendisk (which removes the gendisk from visibility and decrements the refcount), so we've introduced an unbreakable deadlock in the reference counting with this. I suggest simply reversing this patch at the moment. If Russell and Jens can tell me what they're trying to do I'll see if there's another way to do it. James ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 18:58 ` James Bottomley @ 2006-05-12 19:09 ` Linus Torvalds 2006-05-12 20:38 ` Russell King 2006-05-12 20:30 ` Erik Mouw 2006-05-12 20:37 ` Russell King 2 siblings, 1 reply; 44+ messages in thread From: Linus Torvalds @ 2006-05-12 19:09 UTC (permalink / raw) To: James Bottomley Cc: Erik Mouw, Or Gerlitz, linux-scsi, rmk, axboe, Andrew Vasquez, Linux Kernel Mailing List On Fri, 12 May 2006, James Bottomley wrote: > > I suggest simply reversing this patch at the moment. If Russell and > Jens can tell me what they're trying to do I'll see if there's another > way to do it. Reverted, with a big changelog entry to explain why. Linus ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 19:09 ` Linus Torvalds @ 2006-05-12 20:38 ` Russell King 2006-05-12 20:50 ` Linus Torvalds 2006-05-12 20:52 ` Greg KH 0 siblings, 2 replies; 44+ messages in thread From: Russell King @ 2006-05-12 20:38 UTC (permalink / raw) To: Linus Torvalds Cc: James Bottomley, Erik Mouw, Or Gerlitz, linux-scsi, axboe, Andrew Vasquez, Linux Kernel Mailing List On Fri, May 12, 2006 at 12:09:50PM -0700, Linus Torvalds wrote: > On Fri, 12 May 2006, James Bottomley wrote: > > I suggest simply reversing this patch at the moment. If Russell and > > Jens can tell me what they're trying to do I'll see if there's another > > way to do it. > > Reverted, with a big changelog entry to explain why. Great, I'm fucked by the SCSI folk again. Can we revert the patch which broke the MMC/SD layer - the one which added the mount/unmount hotplug events as well then. That way we get back to a working MMC/SD layer as well as a working SCSI layer. -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: 2.6 Serial core ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 20:38 ` Russell King @ 2006-05-12 20:50 ` Linus Torvalds 2006-05-12 20:58 ` Russell King 2006-05-12 21:18 ` Greg KH 2006-05-12 20:52 ` Greg KH 1 sibling, 2 replies; 44+ messages in thread From: Linus Torvalds @ 2006-05-12 20:50 UTC (permalink / raw) To: Russell King Cc: James Bottomley, Erik Mouw, Or Gerlitz, linux-scsi, axboe, Andrew Vasquez, Linux Kernel Mailing List, Greg KH On Fri, 12 May 2006, Russell King wrote: > > Great, I'm fucked by the SCSI folk again. No, you introduced a regression. This isn't the SCSI layer being evil, this is the "regressions aren't acceptable". Fixing one bug and introducing another is actively _worse_ than having the same bug stay around. > Can we revert the patch which broke the MMC/SD layer - the one which > added the mount/unmount hotplug events as well then. > > That way we get back to a working MMC/SD layer as well as a working > SCSI layer. That's certainly the logical fix - push the pain up the chain to the person who introduced it. Which commit is that, do you know? Really, the added ref-count should be gotten by whoever holds on to the thing, and it sounds like it's the hotplug event that caused this and should have held on to its hotplug reference. Greg added to the Cc: list in case he already knows off-hand which commit it is.. Linus ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 20:50 ` Linus Torvalds @ 2006-05-12 20:58 ` Russell King 2006-05-12 21:27 ` Linus Torvalds 2006-05-12 21:18 ` Greg KH 1 sibling, 1 reply; 44+ messages in thread From: Russell King @ 2006-05-12 20:58 UTC (permalink / raw) To: Linus Torvalds Cc: James Bottomley, Erik Mouw, Or Gerlitz, linux-scsi, axboe, Andrew Vasquez, Linux Kernel Mailing List, Greg KH On Fri, May 12, 2006 at 01:50:46PM -0700, Linus Torvalds wrote: > On Fri, 12 May 2006, Russell King wrote: > > Great, I'm fucked by the SCSI folk again. > > No, you introduced a regression. This isn't the SCSI layer being evil, > this is the "regressions aren't acceptable". No - I introduced a correct fix. I think actually we're heading towards needing Linux V2 - the rewrite. It seems that fixing simple bugs cause other bugs, and that means we're heading into a maintainability nightmare. > > Can we revert the patch which broke the MMC/SD layer - the one which > > added the mount/unmount hotplug events as well then. > > > > That way we get back to a working MMC/SD layer as well as a working > > SCSI layer. > > That's certainly the logical fix - push the pain up the chain to the > person who introduced it. Which commit is that, do you know? No idea I'm afraid, and since I've had a _really_ extremely shitty couple of days I'm not about to start going to look for it. > Really, the added ref-count should be gotten by whoever holds on to the > thing, and it sounds like it's the hotplug event that caused this and > should have held on to its hotplug reference. ... which would be the genhd layer - it's keeping the driverfs_dev around so that the genhd layer can generate hotplug events using it at mount/umount time. Thanks for just re-confirming my original fix and that it's SCSI which is the real problem. 8) -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: 2.6 Serial core ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 20:58 ` Russell King @ 2006-05-12 21:27 ` Linus Torvalds 2006-05-12 21:46 ` Linus Torvalds 2006-05-12 21:51 ` Russell King 0 siblings, 2 replies; 44+ messages in thread From: Linus Torvalds @ 2006-05-12 21:27 UTC (permalink / raw) To: Russell King Cc: James Bottomley, Erik Mouw, Or Gerlitz, linux-scsi, axboe, Andrew Vasquez, Linux Kernel Mailing List, Greg KH On Fri, 12 May 2006, Russell King wrote: > > On Fri, May 12, 2006 at 01:50:46PM -0700, Linus Torvalds wrote: > > > > No, you introduced a regression. This isn't the SCSI layer being evil, > > this is the "regressions aren't acceptable". > > No - I introduced a correct fix. Not so. You introduced a commit that fixed one thing, and broke another thing. This is not rocket science, Russell. This is how programming works. There are dependencies on behaviour, and if you change behaviour, it can fix one problem while breaking another one, because different sides depend on different semantics. Real complexity is in linkages and communication. Surprise, surprise. The _real_ broken ends up being elsewhere. It might just be a communication error (two different subsystems got the wrong idea because of bad communication about how something _should_ be used), but it migth also be some bad design. But saying that the two one-liners were the "correct fix" is premature just because they fixed the symptoms for _you_. > It seems that fixing simple bugs cause other bugs, and that means we're > heading into a maintainability nightmare. That's just stupid, Russell. You DID NOT FIX A SIMPLE BUG. You made a change, that on the face of it _looked_ like a simple fix for a simple bug. It wasn't. No amount of you saying so makes it so. It broke something else, and suddenly what you claim was a simple and clear bug-fix obviously isn't. Ergo, it ended up not being "fixing simple bugs" at all. > No idea I'm afraid, and since I've had a _really_ extremely shitty couple > of days I'm not about to start going to look for it. Right. You're pissy. That's fine. But you having a shitty couple of days doesn't mean you need to take it out on everybody else. Especially not when people actually offered to help, and asked for details on exactly what the problem that _you_ tracked was. You depend on the exact same infrastructure that SCSI depends on. SCSI has different expectations of that infrastructure than you do. Arguably, there are more SCSI users out there than MMC users, so when you claim that it's SCSI's fault, you're being biased. It's equally possible that the fault was in the MMC layer all along, and that the MMC layers use of the structure was the thing that caused the oops for the MMC later. Since you didn't bother to even point to the oopses or the patch that caused then now, you aren't actually giving any reason to believe that your view of the situation is the only right one. So you havign a hissy-fit actually makes it harder for people to even help you. James already said "If Russell and Jens can tell me what they're trying to do I'll see if there's another way to do it." but you didn't, you just started ranting. See? Not very useful, is it? > > Really, the added ref-count should be gotten by whoever holds on to the > > thing, and it sounds like it's the hotplug event that caused this and > > should have held on to its hotplug reference. > > ... which would be the genhd layer - it's keeping the driverfs_dev > around so that the genhd layer can generate hotplug events using it > at mount/umount time. Thanks for just re-confirming my original fix > and that it's SCSI which is the real problem. 8) No. You want your fix to be the correct one, but the fact is, it may well not be that at all. Maybe - once we look at what the oops is - we find that it really _is_ the MMC layer that just doesn't follow the assumptions that the device layer was doing. I said "sounds like". And without help, we can't much help you. And being pissy about it, will just make everybody more sure that it's the MMC layer. Reference counts aren't always simple. Exactly because you can get a "recursive" reference count, where a device holds on to its reference count just by account of existing - and that is a BUG, because such a reference count will obviously never go down to zero. Now, not going down to zero will certainly hide any problems that come from the object being released early, because it will _never_ be released. And it sounds like this is actually what your patch _really_ caused. The oops went away, but perhaps at the cost of a leak? And hey, maybe your patch was proper, and the bug really _is_ in the SCSI layer. In which case we get back to that message from James that you ignored. Linus ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 21:27 ` Linus Torvalds @ 2006-05-12 21:46 ` Linus Torvalds 2006-05-12 21:51 ` Russell King 1 sibling, 0 replies; 44+ messages in thread From: Linus Torvalds @ 2006-05-12 21:46 UTC (permalink / raw) To: Russell King Cc: James Bottomley, Erik Mouw, Or Gerlitz, linux-scsi, axboe, Andrew Vasquez, Linux Kernel Mailing List, Greg KH On Fri, 12 May 2006, Linus Torvalds wrote: > > You introduced a commit that fixed one thing, and broke another thing. And btw, don't take that "you" personally. This happens. All the time. And definitely _not_ just to you. It's why common infrastructure can be such an incredible pain: it may be a nice common layer, but it does obviously end up affecting a hell of a lot of different devices and usages. "Private" code is often much better, and that's what we used to have. Now, sadly, I think we need that common device layer infrastructure exactly because otherwise we could never have done any global device management etc, so in this case that common interface is definitely a "necessary evil". And with that necessary evil comes the linkages that it implies. We'd all be much happier of one piece of code didn't depend on five or six other pieces of code, and a bug-fix in one place would be guaranteed to not ever have any other side effects. We'd all also be much happier if we were all young, healthy, good-looking and drive Lamborghinis. And didn't have incipient beer-bellies (not that _I_ would ever have one, of course.. Oh, no. I'm obviously talking about all you other scruffy people. Me, I'm perfect.) Sadly, neither of the above schenarios are really very realistic. So we'd love to have more information. Please? Linus ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 21:27 ` Linus Torvalds 2006-05-12 21:46 ` Linus Torvalds @ 2006-05-12 21:51 ` Russell King 2006-05-12 22:15 ` Linus Torvalds 1 sibling, 1 reply; 44+ messages in thread From: Russell King @ 2006-05-12 21:51 UTC (permalink / raw) To: Linus Torvalds Cc: James Bottomley, Erik Mouw, Or Gerlitz, linux-scsi, axboe, Andrew Vasquez, Linux Kernel Mailing List, Greg KH On Fri, May 12, 2006 at 02:27:19PM -0700, Linus Torvalds wrote: > [remove all the inflamatory stuff] Anyway, you asked for the original oopsen, and here they are. Enjoy. From: Todd Blumer <todd@sdgsystems.com> On a PXA27x handheld (iPAQ hx4700), when we eject a mounted SD memory card, we get a kernel panic (kernel trying to clean up non-existent device). One hack patch to avoid the panic is: --- fs/partitions/check.c 10 Apr 2006 22:57:27 -0000 1.15 +++ fs/partitions/check.c 4 May 2006 20:30:15 -0000 @@ -491,6 +491,7 @@ kfree(disk_name); } put_device(disk->driverfs_dev); + disk->driverfs_dev = 0; /* HACK - what's the right solution? */ } kobject_uevent(&disk->kobj, KOBJ_REMOVE); kobject_del(&disk->kobj); ... root@ipaq-pxa270:~# Unable to handle kernel NULL pointer dereference at virtual2pgd = c218c000 [00000002] *pgd=a20ec031, *pte=00000000, *ppte=00000000 Internal error: Oops: 17 [#1] Modules linked in: nls_iso8859_1 nls_cp437 snd_pcm_oss snd_mixer_oss snd_hx4700rCPU: 0 PC is at strlen+0xc/0x30 LR is at kobject_get_path+0x2c/0xc0 pc : [<c00ee56c>] lr : [<c00eb724>] Not tainted sp : c21bfde4 ip : c21bfdf4 fp : c21bfdf0 r10: c21bfe28 r9 : 000007a5 r8 : c21bfe2c r7 : c3554074 r6 : 000000d0 r5 : 00000001 r4 : c3554074 r3 : 00000002 r2 : c01fa88f r1 : 00000002 r0 : 00000002 Flags: nzCv IRQs on FIQs on Mode SVC_32 Segment user Control: 397F Table: A218C000 DAC: 00000015 Process umount (pid: 2384, stack limit = 0xc21be198) Stack: (0xc21bfde4 to 0xc21c0000) fde0: c21bfe14 c21bfdf4 c00eb724 c00ee56c c34876c0 c355400c c3542b78 fe00: 0000001a c21bfe2c c21bfe58 c21bfe18 c00e2edc c00eb704 000007a5 c21bfe28 fe20: c01fa888 000000fe 00000012 00000002 c3c9585b c0226e04 c32ff7a8 c3542b60 fe40: c3c9583f c01fab8c c0226db0 c21bfeb0 c21bfe5c c00ec130 c00e2dcc c3c9585b fe60: 000007a5 c3c95800 c0226dc4 c3e15b20 00000001 00000001 00000000 00000000 fe80: ffffffff bee45e28 c0331040 c3554200 c3554200 00000000 00000000 c21bff28 fea0: c21bff20 c21bfec0 c21bfeb4 c0084d6c c00ebf1c c21bfed8 c21bfec4 c0084f04 fec0: c0084d48 c3554200 c0225d84 c21bfef0 c21bfedc c0083d90 c0084ef0 c3554200 fee0: c032bd20 c21bff08 c21bfef4 c009b804 c0083d30 c21bff28 c21be000 c21bff1c ff00: c21bff0c c008baec c009b79c 00000000 c21bff94 c21bff20 c009c0ec c008bad8 ff20: c21bff20 c21bff20 c20e73a0 c032bd20 c21bff3c c0024db0 c00ee388 00000001 ff40: 00000001 00000000 00000000 ffffffff bee45e28 00000000 00000000 c21bffb0 ff60: 00000000 00091050 c21bff9c bee44db8 000923c8 bee44db8 00000016 c001cf64 ff80: c21be000 00091050 c21bffa4 c21bff98 c009c114 c009beec 00000000 c21bffa8 ffa0: c001cdc0 c009c10c bee44db8 000923c8 bee44db8 bee44dba 000923ca 0000006d ffc0: bee44db8 000923c8 bee44db8 00000000 00090710 00000000 00091050 00000000 ffe0: 40160e70 bee44d9c 00063fc4 40160e74 60000010 bee44db8 401bd7d8 400dad9c Backtrace: [<c00ee560>] (strlen+0x0/0x30) from [<c00eb724>] (kobject_get_path+0x2c/0xc0) [<c00eb6f8>] (kobject_get_path+0x0/0xc0) from [<c00e2edc>] (block_uevent+0x11c/) r8 = C21BFE2C r7 = 0000001A r6 = C3542B78 r5 = C355400C r4 = C34876C0 [<c00e2dc0>] (block_uevent+0x0/0x1f4) from [<c00ec130>] (kobject_uevent+0x220/0)[<c00ebf10>] (kobject_uevent+0x0/0x478) from [<c0084d6c>] (bdev_uevent+0x30/0x3)[<c0084d3c>] (bdev_uevent+0x0/0x34) from [<c0084f04>] (kill_block_super+0x20/0x)[<c0084ee4>] (kill_block_super+0x0/0x3c) from [<c0083d90>] (deactivate_super+0x) r5 = C0225D84 r4 = C3554200 [<c0083d24>] (deactivate_super+0x0/0x84) from [<c009b804>] (mntput_no_expire+0x) r5 = C032BD20 r4 = C3554200 [<c009b790>] (mntput_no_expire+0x0/0xc0) from [<c008baec>] (path_release_on_umo) r5 = C21BE000 r4 = C21BFF28 [<c008bacc>] (path_release_on_umount+0x0/0x24) from [<c009c0ec>] (sys_umount+0x) r4 = 00000000 [<c009bee0>] (sys_umount+0x0/0x220) from [<c009c114>] (sys_oldumount+0x14/0x18) [<c009c100>] (sys_oldumount+0x0/0x18) from [<c001cdc0>] (ret_fast_syscall+0x0/0)Code: e89da800 e1a0c00d e92dd800 e24cb004 (e5d03000) and: From: Mikkel Erup <mikkelerup () yahoo ! com> BUG: unable to handle kernel NULL pointer dereference at virtual address 00000002 printing eip: c021570d *pde = 00000000 Oops: 0000 [#1] PREEMPT Modules linked in: nls_iso8859_1 nls_cp437 vfat fat speedstep_lib i915 drm mmc_block nvram pcmcia yenta_socket rsrc_nonstatic sdhci pcmcia_core e1000 mmc_core evdev CPU: 0 EIP: 0060:[<c021570d>] Not tainted VLI EFLAGS: 00010202 (2.6.16-git20 #3) EIP is at kobject_get_path+0x2d/0xe0 eax: 00000000 ebx: 00000001 ecx: ffffffff edx: 00000000 esi: e35e9e74 edi: 00000002 ebp: 000007a5 esp: dcea1e5c ds: 007b es: 007b ss: 0068 Process umount (pid: 3407, threadinfo=dcea0000 task=dd334a30) Stack: <0>c038cad3 dcea1ea0 00000008 dcea1e9c e2765200 e35e9e0c df304158 000007a5 c020bb5d e35e9e74 000000d0 dcea1ea0 de9ad85b 000007a5 dcea1ea4 c038cacb 000000fe 00000002 00000012 c020ba50 c03c8860 0000001a de9ad800 c0215d1e Call Trace: <c020bb5d> block_uevent+0x10d/0x210 <c020ba50> block_uevent+0x0/0x210 <c0215d1e> kobject_uevent+0x1fe/0x520 <c0166860> bdev_uevent+0x20/0x40 <c0166cc1> kill_block_super+0x21/0x50 <c0166f80> deactivate_super+0x70/0xa0 <c017fecb> sys_umount+0x4b/0x280 <c010306f> sysenter_past_esp+0x54/0x75 Code: d2 57 56 53 bb 01 00 00 00 83 ec 10 8b 74 24 24 8d b4 26 00 00 00 00 8d bc 27 00 00 00 00 8b 3e 85 ff 74 1d b9 ff ff ff ff 89 d0 <f2> ae f7 d1 49 8b 76 24 8d 2c 19 8d 5d 01 85 f6 75 e1 85 db 75 Now, can I have your permission not to do anything more tonight? Is that okay with you, or are you going to rant about that as well? -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: 2.6 Serial core ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 21:51 ` Russell King @ 2006-05-12 22:15 ` Linus Torvalds 2006-05-12 22:37 ` Linus Torvalds 0 siblings, 1 reply; 44+ messages in thread From: Linus Torvalds @ 2006-05-12 22:15 UTC (permalink / raw) To: Russell King Cc: James Bottomley, Erik Mouw, Or Gerlitz, linux-scsi, axboe, Andrew Vasquez, Linux Kernel Mailing List, Greg KH On Fri, 12 May 2006, Russell King wrote: > > From: Todd Blumer <todd@sdgsystems.com> > On a PXA27x handheld (iPAQ hx4700), when we eject a mounted SD memory > card, we get a kernel panic (kernel trying to clean up non-existent > device). One hack patch to avoid the panic is: > > --- fs/partitions/check.c 10 Apr 2006 22:57:27 -0000 1.15 > +++ fs/partitions/check.c 4 May 2006 20:30:15 -0000 > @@ -491,6 +491,7 @@ > kfree(disk_name); > } > put_device(disk->driverfs_dev); > + disk->driverfs_dev = 0; /* HACK - what's the right solution? */ > } > kobject_uevent(&disk->kobj, KOBJ_REMOVE); > kobject_del(&disk->kobj); Btw, on the face it of, I really think that this patch is correct regardless of any other issues. We're clearly getting rid of "disk->driverfs_dev" (that's what the "put_device()" does). So we should clear it out - using "disk->driverfs_dev" afterwards is clearly not _valid_, because we've dropped the reference. Of course, we shouldn't assign zero to it. We should assign NULL. Alternatively, we could poison it (so that anybody who tries to use it gets a nice oops ASAP). Now, regardless of whether that is correct or not, it might not be the only problem, of course. It sounds like there is something else going on here too. Linus ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 22:15 ` Linus Torvalds @ 2006-05-12 22:37 ` Linus Torvalds 2006-05-12 23:57 ` Greg KH 0 siblings, 1 reply; 44+ messages in thread From: Linus Torvalds @ 2006-05-12 22:37 UTC (permalink / raw) To: Russell King Cc: James Bottomley, Erik Mouw, Or Gerlitz, linux-scsi, axboe, Andrew Vasquez, Linux Kernel Mailing List, Greg KH, Todd Blumer On Fri, 12 May 2006, Linus Torvalds wrote: > > On Fri, 12 May 2006, Russell King wrote: > > > > From: Todd Blumer <todd@sdgsystems.com> > > On a PXA27x handheld (iPAQ hx4700), when we eject a mounted SD memory > > card, we get a kernel panic (kernel trying to clean up non-existent > > device). One hack patch to avoid the panic is: > > > > --- fs/partitions/check.c 10 Apr 2006 22:57:27 -0000 1.15 > > +++ fs/partitions/check.c 4 May 2006 20:30:15 -0000 > > @@ -491,6 +491,7 @@ > > kfree(disk_name); > > } > > put_device(disk->driverfs_dev); > > + disk->driverfs_dev = 0; /* HACK - what's the right solution? */ > > } > > kobject_uevent(&disk->kobj, KOBJ_REMOVE); > > kobject_del(&disk->kobj); > > Btw, on the face it of, I really think that this patch is correct > regardless of any other issues. .. and I suspect it also shows what the "other issues" are. Shouldn't that KOBJ_REMOVE uevent happen _before_ we do all the freeing of the backing dev object? That KOBJ_REMOVE thing actually seems to want to report the pathname for the disk it removes. Preferably before the thing is gone and can't be reported on.. Ie shouldn't the diff be something like this? Linus --- diff --git a/fs/partitions/check.c b/fs/partitions/check.c index 45ae7dd..7ef1f09 100644 --- a/fs/partitions/check.c +++ b/fs/partitions/check.c @@ -533,6 +533,7 @@ void del_gendisk(struct gendisk *disk) devfs_remove_disk(disk); + kobject_uevent(&disk->kobj, KOBJ_REMOVE); if (disk->holder_dir) kobject_unregister(disk->holder_dir); if (disk->slave_dir) @@ -545,7 +546,7 @@ void del_gendisk(struct gendisk *disk) kfree(disk_name); } put_device(disk->driverfs_dev); + disk->driverfs_dev = NULL; } - kobject_uevent(&disk->kobj, KOBJ_REMOVE); kobject_del(&disk->kobj); } ^ permalink raw reply related [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 22:37 ` Linus Torvalds @ 2006-05-12 23:57 ` Greg KH 2006-05-14 16:01 ` Russell King 0 siblings, 1 reply; 44+ messages in thread From: Greg KH @ 2006-05-12 23:57 UTC (permalink / raw) To: Linus Torvalds Cc: Russell King, James Bottomley, Erik Mouw, Or Gerlitz, linux-scsi, axboe, Andrew Vasquez, Linux Kernel Mailing List, Todd Blumer On Fri, May 12, 2006 at 03:37:59PM -0700, Linus Torvalds wrote: > > > On Fri, 12 May 2006, Linus Torvalds wrote: > > > > On Fri, 12 May 2006, Russell King wrote: > > > > > > From: Todd Blumer <todd@sdgsystems.com> > > > On a PXA27x handheld (iPAQ hx4700), when we eject a mounted SD memory > > > card, we get a kernel panic (kernel trying to clean up non-existent > > > device). One hack patch to avoid the panic is: > > > > > > --- fs/partitions/check.c 10 Apr 2006 22:57:27 -0000 1.15 > > > +++ fs/partitions/check.c 4 May 2006 20:30:15 -0000 > > > @@ -491,6 +491,7 @@ > > > kfree(disk_name); > > > } > > > put_device(disk->driverfs_dev); > > > + disk->driverfs_dev = 0; /* HACK - what's the right solution? */ > > > } > > > kobject_uevent(&disk->kobj, KOBJ_REMOVE); > > > kobject_del(&disk->kobj); > > > > Btw, on the face it of, I really think that this patch is correct > > regardless of any other issues. > > .. and I suspect it also shows what the "other issues" are. > > Shouldn't that KOBJ_REMOVE uevent happen _before_ we do all the freeing of > the backing dev object? That KOBJ_REMOVE thing actually seems to want to > report the pathname for the disk it removes. Preferably before the thing > is gone and can't be reported on.. > > Ie shouldn't the diff be something like this? It looks sane to me. Russell, does it solve your oops too? thanks, greg k-h ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 23:57 ` Greg KH @ 2006-05-14 16:01 ` Russell King 0 siblings, 0 replies; 44+ messages in thread From: Russell King @ 2006-05-14 16:01 UTC (permalink / raw) To: Greg KH Cc: Linus Torvalds, James Bottomley, Erik Mouw, Or Gerlitz, linux-scsi, axboe, Andrew Vasquez, Linux Kernel Mailing List, Todd Blumer On Fri, May 12, 2006 at 04:57:45PM -0700, Greg KH wrote: > On Fri, May 12, 2006 at 03:37:59PM -0700, Linus Torvalds wrote: > > On Fri, 12 May 2006, Linus Torvalds wrote: > > > On Fri, 12 May 2006, Russell King wrote: > > > > > > > > From: Todd Blumer <todd@sdgsystems.com> > > > > On a PXA27x handheld (iPAQ hx4700), when we eject a mounted SD memory > > > > card, we get a kernel panic (kernel trying to clean up non-existent > > > > device). One hack patch to avoid the panic is: > > > > > > > > --- fs/partitions/check.c 10 Apr 2006 22:57:27 -0000 1.15 > > > > +++ fs/partitions/check.c 4 May 2006 20:30:15 -0000 > > > > @@ -491,6 +491,7 @@ > > > > kfree(disk_name); > > > > } > > > > put_device(disk->driverfs_dev); > > > > + disk->driverfs_dev = 0; /* HACK - what's the right solution? */ > > > > } > > > > kobject_uevent(&disk->kobj, KOBJ_REMOVE); > > > > kobject_del(&disk->kobj); > > > > > > Btw, on the face it of, I really think that this patch is correct > > > regardless of any other issues. > > > > .. and I suspect it also shows what the "other issues" are. > > > > Shouldn't that KOBJ_REMOVE uevent happen _before_ we do all the freeing of > > the backing dev object? That KOBJ_REMOVE thing actually seems to want to > > report the pathname for the disk it removes. Preferably before the thing > > is gone and can't be reported on.. > > > > Ie shouldn't the diff be something like this? > > It looks sane to me. Russell, does it solve your oops too? It appears to. -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: 2.6 Serial core ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 20:50 ` Linus Torvalds 2006-05-12 20:58 ` Russell King @ 2006-05-12 21:18 ` Greg KH 2006-05-12 21:32 ` Linus Torvalds 1 sibling, 1 reply; 44+ messages in thread From: Greg KH @ 2006-05-12 21:18 UTC (permalink / raw) To: Linus Torvalds Cc: Russell King, James Bottomley, Erik Mouw, Or Gerlitz, linux-scsi, axboe, Andrew Vasquez, Linux Kernel Mailing List On Fri, May 12, 2006 at 01:50:46PM -0700, Linus Torvalds wrote: > On Fri, 12 May 2006, Russell King wrote: > > Can we revert the patch which broke the MMC/SD layer - the one which > > added the mount/unmount hotplug events as well then. > > > > That way we get back to a working MMC/SD layer as well as a working > > SCSI layer. > > That's certainly the logical fix - push the pain up the chain to the > person who introduced it. Which commit is that, do you know? > > Really, the added ref-count should be gotten by whoever holds on to the > thing, and it sounds like it's the hotplug event that caused this and > should have held on to its hotplug reference. > > Greg added to the Cc: list in case he already knows off-hand which commit > it is.. No, I don't know, that's why I just asked :) And this bug doesn't have anything to do with why my mmc/sd cards are suddenly not showing up at all anymore in my laptop, right? I need to track that regression from 2.6.17-rc1 down... thanks, greg k-h ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 21:18 ` Greg KH @ 2006-05-12 21:32 ` Linus Torvalds 2006-05-12 22:45 ` Greg KH 0 siblings, 1 reply; 44+ messages in thread From: Linus Torvalds @ 2006-05-12 21:32 UTC (permalink / raw) To: Greg KH Cc: Russell King, James Bottomley, Erik Mouw, Or Gerlitz, linux-scsi, axboe, Andrew Vasquez, Linux Kernel Mailing List On Fri, 12 May 2006, Greg KH wrote: > > No, I don't know, that's why I just asked :) > > And this bug doesn't have anything to do with why my mmc/sd cards are > suddenly not showing up at all anymore in my laptop, right? I need to > track that regression from 2.6.17-rc1 down... Well, that's certainly an interesting regression to look into too.. Here's all I know as back-ground.. Russell's patch certainly _looked_ ok, which is why it then got acked and merged, but that was before we had multiple people reporting that it breaks things for them. Linus -- On Thu, 4 May 2006, Russell King wrote: > > Linus, > > This has been confirmed to fix an issue which Mikkel discovered, and > I'm now seeing reports from other people about this. > > I'm not getting any response on it from either Al or Jens and it > appears to be otherwise ignored - help! Would you take this patch > direct from me? > > ----- Forwarded message from Russell King <rmk+lkml@arm.linux.org.uk> ----- > > Date: Fri, 7 Apr 2006 15:40:46 +0100 > From: Russell King <rmk+lkml@arm.linux.org.uk> > To: Pierre Ossman <drzeus-list@drzeus.cx>, > Al Viro <viro@ftp.uk.linux.org>, Jens Axboe <axboe@suse.de> > Cc: Mikkel Erup <mikkelerup@yahoo.com>, Greg KH <greg@kroah.com>, > linux-kernel@vger.kernel.org, Andrew Morton <akpm@osdl.org> > Subject: Re: sdhci driver produces kernel oops on ejecting the card > > On Fri, Apr 07, 2006 at 10:47:54AM +0200, Pierre Ossman wrote: > > Mikkel Erup wrote: > > > > > > It happens with 2.6.16-git20 as well. > > > Attached are the log file and kernel .config > > > > > > > Since it ooopses during umount, I'm guessing that it's a problem > > somewhere in mmc_block. I'll try to get some time to look closer at it > > during the weekend. Perhaps Russell has some idea until then? > > $ grep -n driverfs_dev block/genhd.c > 558: physdev = disk->driverfs_dev; > $ > > Hmm, okay, genhd contains a reference to a device object, but there's > no sign of _any_ refcounting in sight. > > What's happening is that the MMC card block device is setup and registered > with genhd. We set md->disk->driverfs_dev to point at the owning device > structure. > > This generates a uevent, which causes disk->driverfs_dev to be dereferenced. > All fine here. We mount the partition, which causes another uevent to be > generated, again dereferencing disk->driverfs_dev. > > If we remove the MMC card, we destroy the MMC card block device. This > seems to generate another uevent for the block device. At this point, > the counted references to the MMC card block device fall to zero. > > But wait! There's still uncounted disk->driverfs_dev reference waiting > for... > > You unmount the partition. This calls block_uevent, which dereferences > disk->driverfs_dev. You know what happens now. > > The levels above genhd can't do the refcounting because they don't know > when stuff has finished with driverfs_dev. So the only place for sane > refcounting seems to be genhd.c, as per the patch below. > > Comments? > > diff -up -x BitKeeper -x ChangeSet -x SCCS -x _xlk -x *.orig -x *.rej -x .git a/block/genhd.c b/block/genhd.c > --- a/block/genhd.c Sat Feb 18 10:31:37 2006 > +++ b/block/genhd.c Fri Apr 7 15:22:21 2006 > @@ -262,6 +262,7 @@ static int exact_lock(dev_t dev, void *d > */ > void add_disk(struct gendisk *disk) > { > + get_device(disk->driverfs_dev); > disk->flags |= GENHD_FL_UP; > blk_register_region(MKDEV(disk->major, disk->first_minor), > disk->minors, NULL, exact_match, exact_lock, disk); > @@ -507,6 +508,7 @@ static struct attribute * default_attrs[ > static void disk_release(struct kobject * kobj) > { > struct gendisk *disk = to_disk(kobj); > + put_device(disk->driverfs_dev); > kfree(disk->random); > kfree(disk->part); > free_disk_stats(disk); > > > -- > Russell King > Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ > maintainer of: 2.6 Serial core > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > > ----- End forwarded message ----- > > -- > Russell King > ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 21:32 ` Linus Torvalds @ 2006-05-12 22:45 ` Greg KH 0 siblings, 0 replies; 44+ messages in thread From: Greg KH @ 2006-05-12 22:45 UTC (permalink / raw) To: Linus Torvalds Cc: Russell King, James Bottomley, Erik Mouw, Or Gerlitz, linux-scsi, axboe, Andrew Vasquez, Linux Kernel Mailing List On Fri, May 12, 2006 at 02:32:05PM -0700, Linus Torvalds wrote: > > > On Fri, 12 May 2006, Greg KH wrote: > > > > No, I don't know, that's why I just asked :) > > > > And this bug doesn't have anything to do with why my mmc/sd cards are > > suddenly not showing up at all anymore in my laptop, right? I need to > > track that regression from 2.6.17-rc1 down... > > Well, that's certainly an interesting regression to look into too.. No, nevermind, I got this to work again. Turns out you need to have a SD card in the reader when loading the sdhci module for it to work properly. I'll take it up with the author of the driver as to if this is "correct" or not. thanks, greg k-h ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 20:38 ` Russell King 2006-05-12 20:50 ` Linus Torvalds @ 2006-05-12 20:52 ` Greg KH 2006-05-12 21:03 ` Russell King 1 sibling, 1 reply; 44+ messages in thread From: Greg KH @ 2006-05-12 20:52 UTC (permalink / raw) To: Linus Torvalds, James Bottomley, Erik Mouw, Or Gerlitz, linux-scsi, axboe, Andrew Vasquez, Linux Kernel Mailing List On Fri, May 12, 2006 at 09:38:50PM +0100, Russell King wrote: > On Fri, May 12, 2006 at 12:09:50PM -0700, Linus Torvalds wrote: > > On Fri, 12 May 2006, James Bottomley wrote: > > > I suggest simply reversing this patch at the moment. If Russell and > > > Jens can tell me what they're trying to do I'll see if there's another > > > way to do it. > > > > Reverted, with a big changelog entry to explain why. > > Great, I'm fucked by the SCSI folk again. > > Can we revert the patch which broke the MMC/SD layer - the one which > added the mount/unmount hotplug events as well then. Why would the mount/unmount hotplug event change break MMC/SD? Do you have a reference to the patch in question? thanks, greg k-h ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 20:52 ` Greg KH @ 2006-05-12 21:03 ` Russell King 2006-05-12 21:34 ` Greg KH 2006-05-12 21:39 ` Al Viro 0 siblings, 2 replies; 44+ messages in thread From: Russell King @ 2006-05-12 21:03 UTC (permalink / raw) To: Greg KH Cc: Linus Torvalds, James Bottomley, Erik Mouw, Or Gerlitz, linux-scsi, axboe, Andrew Vasquez, Linux Kernel Mailing List On Fri, May 12, 2006 at 01:52:15PM -0700, Greg KH wrote: > On Fri, May 12, 2006 at 09:38:50PM +0100, Russell King wrote: > > On Fri, May 12, 2006 at 12:09:50PM -0700, Linus Torvalds wrote: > > > On Fri, 12 May 2006, James Bottomley wrote: > > > > I suggest simply reversing this patch at the moment. If Russell and > > > > Jens can tell me what they're trying to do I'll see if there's another > > > > way to do it. > > > > > > Reverted, with a big changelog entry to explain why. > > > > Great, I'm fucked by the SCSI folk again. > > > > Can we revert the patch which broke the MMC/SD layer - the one which > > added the mount/unmount hotplug events as well then. > > Why would the mount/unmount hotplug event change break MMC/SD? Do you > have a reference to the patch in question? Please read the commit message in the change in question. The block layer holds on to a reference to a struct device which isn't refcounted (until I added it with my patch.) Hence struct gendisk structures have a completely independent lifetime and are only destroyed when all references to them are removed. -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: 2.6 Serial core ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 21:03 ` Russell King @ 2006-05-12 21:34 ` Greg KH 2006-05-12 21:39 ` Al Viro 1 sibling, 0 replies; 44+ messages in thread From: Greg KH @ 2006-05-12 21:34 UTC (permalink / raw) To: Linus Torvalds, James Bottomley, Erik Mouw, Or Gerlitz, linux-scsi, axboe, Andrew Vasquez, Linux Kernel Mailing List On Fri, May 12, 2006 at 10:03:01PM +0100, Russell King wrote: > On Fri, May 12, 2006 at 01:52:15PM -0700, Greg KH wrote: > > On Fri, May 12, 2006 at 09:38:50PM +0100, Russell King wrote: > > > On Fri, May 12, 2006 at 12:09:50PM -0700, Linus Torvalds wrote: > > > > On Fri, 12 May 2006, James Bottomley wrote: > > > > > I suggest simply reversing this patch at the moment. If Russell and > > > > > Jens can tell me what they're trying to do I'll see if there's another > > > > > way to do it. > > > > > > > > Reverted, with a big changelog entry to explain why. > > > > > > Great, I'm fucked by the SCSI folk again. > > > > > > Can we revert the patch which broke the MMC/SD layer - the one which > > > added the mount/unmount hotplug events as well then. > > > > Why would the mount/unmount hotplug event change break MMC/SD? Do you > > have a reference to the patch in question? > > Please read the commit message in the change in question. Hm, I did. I can't seem to find the offending patch you refer to, have a changeset number? thanks, greg k-h ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 21:03 ` Russell King 2006-05-12 21:34 ` Greg KH @ 2006-05-12 21:39 ` Al Viro 1 sibling, 0 replies; 44+ messages in thread From: Al Viro @ 2006-05-12 21:39 UTC (permalink / raw) To: Greg KH, Linus Torvalds, James Bottomley, Erik Mouw, Or Gerlitz, linux-scsi, axboe, Andrew Vasquez, Linux Kernel Mailing List On Fri, May 12, 2006 at 10:03:01PM +0100, Russell King wrote: > The block layer holds on to a reference to a struct device which > isn't refcounted (until I added it with my patch.) Hence struct > gendisk structures have a completely independent lifetime and are > only destroyed when all references to them are removed. Yes, they are and that's intentional. Can you explain WTF do you drop that reference so late and not in the del_gendisk() time? ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 18:58 ` James Bottomley 2006-05-12 19:09 ` Linus Torvalds @ 2006-05-12 20:30 ` Erik Mouw 2006-05-12 20:37 ` Russell King 2 siblings, 0 replies; 44+ messages in thread From: Erik Mouw @ 2006-05-12 20:30 UTC (permalink / raw) To: James Bottomley Cc: Linus Torvalds, Or Gerlitz, linux-scsi, rmk, axboe, Linux Kernel Mailing List On Fri, May 12, 2006 at 01:58:45PM -0500, James Bottomley wrote: > On Fri, 2006-05-12 at 12:47 -0500, James Bottomley wrote: > > I'll look at the release paths and see if I can work out what it is. > > OK, here's the scoop. The problem patch adds a get of driverfs_dev in > add_disk(), but doesn't put it again until disk_release() (which occurs > on final put_disk() of the gendisk). > > However, in SCSI, the driverfs_dev is the sdev_gendev. That means > there's a reference held on sdev_gendev until final disk put. > Unfortunately, we use the driver model driver_remove to trigger > del_gendisk (which removes the gendisk from visibility and decrements > the refcount), so we've introduced an unbreakable deadlock in the > reference counting with this. That explains why I could only trigger the bug with a disk attached to the controller. Thanks for figuring out, Erik -- +-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 -- | Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 18:58 ` James Bottomley 2006-05-12 19:09 ` Linus Torvalds 2006-05-12 20:30 ` Erik Mouw @ 2006-05-12 20:37 ` Russell King 2 siblings, 0 replies; 44+ messages in thread From: Russell King @ 2006-05-12 20:37 UTC (permalink / raw) To: James Bottomley Cc: Linus Torvalds, Erik Mouw, Or Gerlitz, linux-scsi, axboe, Linux Kernel Mailing List On Fri, May 12, 2006 at 01:58:45PM -0500, James Bottomley wrote: > I suggest simply reversing this patch at the moment. If Russell and > Jens can tell me what they're trying to do I'll see if there's another > way to do it. When a MMC card is pulled, we remove the MMC device structure (which is what the driverfs_dev points at.) At this point, the MMC layer *totally* forgets about the MMC device and deletes it. Unfortunately, an uncounted reference is kept while the partition is mounted by the gendisk layer, which when the partition is unmounted via hotplug causes another hotplug event to be generated with respect to this freed MMC device structure, and hence you get an oops. Since the MMC layer has lost all knowledge of the device, the only possible solution is as given in that patch. -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: 2.6 Serial core ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 17:36 ` Linus Torvalds 2006-05-12 17:47 ` James Bottomley @ 2006-05-12 20:21 ` Erik Mouw 2006-05-12 20:34 ` Russell King 2 siblings, 0 replies; 44+ messages in thread From: Erik Mouw @ 2006-05-12 20:21 UTC (permalink / raw) To: Linus Torvalds Cc: Or Gerlitz, linux-scsi, rmk, axboe, Linux Kernel Mailing List On Fri, May 12, 2006 at 10:36:57AM -0700, Linus Torvalds wrote: > It sounds like you used some other test-case than the trivial module above > to test bisection? Yes, my test case is in here: http://marc.theaimsgroup.com/?l=linux-scsi&m=114736053211943&w=2 > So when you say "I tracked _it_ down with git bisect" (my emphasis), it > sounds like "it" was really something else than the trivial stand-alone > module. Or are you saying that the trivial stand-alone module _also_ got > fixed? Sorry for being unclear. "it" is the bug report I send to linux-scsi (see url above). Erik -- +-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 -- | Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 17:36 ` Linus Torvalds 2006-05-12 17:47 ` James Bottomley 2006-05-12 20:21 ` Erik Mouw @ 2006-05-12 20:34 ` Russell King 2006-05-12 21:43 ` Al Viro 2 siblings, 1 reply; 44+ messages in thread From: Russell King @ 2006-05-12 20:34 UTC (permalink / raw) To: Linus Torvalds Cc: Erik Mouw, Or Gerlitz, linux-scsi, axboe, Linux Kernel Mailing List On Fri, May 12, 2006 at 10:36:57AM -0700, Linus Torvalds wrote: > Yes. We could just revert that commit, but it seems correct, and I'd > really like for somebody to understand _why_ that commit matters at all. I > certainly don't see the overlap here.. Reverting the commit breaks MMC/SD in a very real way, and the fix is plainly correct and is actually the only possible fix that can be applied. It sounds to me like SCSI is relying on some buggy behaviour which is specific to the way that the kernel works with the fix removed. Maybe thing is kfree'ing and then reallocating something which remained registered somewhere when it shouldn't do? -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: 2.6 Serial core ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 20:34 ` Russell King @ 2006-05-12 21:43 ` Al Viro 2006-05-12 21:55 ` Russell King 2006-05-12 21:58 ` Al Viro 0 siblings, 2 replies; 44+ messages in thread From: Al Viro @ 2006-05-12 21:43 UTC (permalink / raw) To: Linus Torvalds, Erik Mouw, Or Gerlitz, linux-scsi, axboe, Linux Kernel Mailing List On Fri, May 12, 2006 at 09:34:16PM +0100, Russell King wrote: > On Fri, May 12, 2006 at 10:36:57AM -0700, Linus Torvalds wrote: > > Yes. We could just revert that commit, but it seems correct, and I'd > > really like for somebody to understand _why_ that commit matters at all. I > > certainly don't see the overlap here.. > > Reverting the commit breaks MMC/SD in a very real way, and the fix > is plainly correct and is actually the only possible fix that can be > applied. Bullshit. Could you explain what generic code dereferences ->driverfs_dev after del_gendisk()? If you see such beast, please tell; _that_ is the real bug. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 21:43 ` Al Viro @ 2006-05-12 21:55 ` Russell King 2006-05-12 22:08 ` Al Viro 2006-05-12 21:58 ` Al Viro 1 sibling, 1 reply; 44+ messages in thread From: Russell King @ 2006-05-12 21:55 UTC (permalink / raw) To: Al Viro Cc: Linus Torvalds, Erik Mouw, Or Gerlitz, linux-scsi, axboe, Linux Kernel Mailing List On Fri, May 12, 2006 at 10:43:54PM +0100, Al Viro wrote: > On Fri, May 12, 2006 at 09:34:16PM +0100, Russell King wrote: > > On Fri, May 12, 2006 at 10:36:57AM -0700, Linus Torvalds wrote: > > > Yes. We could just revert that commit, but it seems correct, and I'd > > > really like for somebody to understand _why_ that commit matters at all. I > > > certainly don't see the overlap here.. > > > > Reverting the commit breaks MMC/SD in a very real way, and the fix > > is plainly correct and is actually the only possible fix that can be > > applied. > > Bullshit. Could you explain what generic code dereferences ->driverfs_dev > after del_gendisk()? If you see such beast, please tell; _that_ is the > real bug. Al, I think you're going to eat your own bull on this one. You'll find that in the reply I've just sent to Linus - two oopen reported by two different people since the mount/umount hotplug events got re-merged. The problem case is: - insert card - mount filesystem - remove card - umount filesystem <bang, oops> The generic code is block_uevent(), which is called at umount time _after_ the gendisk has been deleted (which happens when the card has been removed.) Basically, you can't umount a destroyed block device without oopsing. -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: 2.6 Serial core ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 21:55 ` Russell King @ 2006-05-12 22:08 ` Al Viro 2006-05-12 22:22 ` Linus Torvalds 0 siblings, 1 reply; 44+ messages in thread From: Al Viro @ 2006-05-12 22:08 UTC (permalink / raw) To: Linus Torvalds, Erik Mouw, Or Gerlitz, linux-scsi, axboe, Linux Kernel Mailing List On Fri, May 12, 2006 at 10:55:20PM +0100, Russell King wrote: > On Fri, May 12, 2006 at 10:43:54PM +0100, Al Viro wrote: > > On Fri, May 12, 2006 at 09:34:16PM +0100, Russell King wrote: > > > On Fri, May 12, 2006 at 10:36:57AM -0700, Linus Torvalds wrote: > > > > Yes. We could just revert that commit, but it seems correct, and I'd > > > > really like for somebody to understand _why_ that commit matters at all. I > > > > certainly don't see the overlap here.. > > > > > > Reverting the commit breaks MMC/SD in a very real way, and the fix > > > is plainly correct and is actually the only possible fix that can be > > > applied. > > > > Bullshit. Could you explain what generic code dereferences ->driverfs_dev > > after del_gendisk()? If you see such beast, please tell; _that_ is the > > real bug. > > Al, I think you're going to eat your own bull on this one. > > You'll find that in the reply I've just sent to Linus - two oopen > reported by two different people since the mount/umount hotplug > events got re-merged. > > The problem case is: > > - insert card > - mount filesystem > - remove card > - umount filesystem <bang, oops> > > The generic code is block_uevent(), which is called at umount time > _after_ the gendisk has been deleted (which happens when the card has > been removed.) > > Basically, you can't umount a destroyed block device without oopsing. So block_uevent() is bogus; great, we've located the real bug. The real question: what uses these events and what can userland _do_ when it gets something about a device that had been gone for a month? Secondary question: who had resurrected that crap? I distinctly remember killing it off... ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 22:08 ` Al Viro @ 2006-05-12 22:22 ` Linus Torvalds 2006-05-12 22:28 ` Al Viro 2006-05-12 22:37 ` Russell King 0 siblings, 2 replies; 44+ messages in thread From: Linus Torvalds @ 2006-05-12 22:22 UTC (permalink / raw) To: Al Viro; +Cc: Erik Mouw, Or Gerlitz, linux-scsi, axboe, Linux Kernel Mailing List On Fri, 12 May 2006, Al Viro wrote: > > Secondary question: who had resurrected that crap? I distinctly remember > killing it off... If you did, I don't think it ever got into the kernel. It was added by Kay Sievers on Nov 3, 2004, according to the old history (back then it was in drivers/block/genhd.c, and the function was called "block_hotplug()", but apart from renaming the function and moving the file, it's recognizably the same. Of course, you may have killed off an even earlier incarnation.. Linus ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 22:22 ` Linus Torvalds @ 2006-05-12 22:28 ` Al Viro 2006-05-12 22:48 ` Al Viro 2006-05-13 0:08 ` Greg KH 2006-05-12 22:37 ` Russell King 1 sibling, 2 replies; 44+ messages in thread From: Al Viro @ 2006-05-12 22:28 UTC (permalink / raw) To: Linus Torvalds Cc: Erik Mouw, Or Gerlitz, linux-scsi, axboe, Linux Kernel Mailing List On Fri, May 12, 2006 at 03:22:42PM -0700, Linus Torvalds wrote: > > > On Fri, 12 May 2006, Al Viro wrote: > > > > Secondary question: who had resurrected that crap? I distinctly remember > > killing it off... > > If you did, I don't think it ever got into the kernel. > > It was added by Kay Sievers on Nov 3, 2004, according to the old history > (back then it was in drivers/block/genhd.c, and the function was called > "block_hotplug()", but apart from renaming the function and moving the > file, it's recognizably the same. > > Of course, you may have killed off an even earlier incarnation.. Ah, right - there was another uevent mess in fs/super.c. Sorry, got them confused... and that FPOS _is_ back. gregkh, may I ask why had bdev_uevent() been resurrected? ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 22:28 ` Al Viro @ 2006-05-12 22:48 ` Al Viro 2006-05-12 22:51 ` Al Viro 2006-05-13 0:08 ` Greg KH 1 sibling, 1 reply; 44+ messages in thread From: Al Viro @ 2006-05-12 22:48 UTC (permalink / raw) To: Linus Torvalds Cc: Erik Mouw, Or Gerlitz, linux-scsi, axboe, Linux Kernel Mailing List On Fri, May 12, 2006 at 11:28:16PM +0100, Al Viro wrote: > Ah, right - there was another uevent mess in fs/super.c. Sorry, got them > confused... and that FPOS _is_ back. Actually... What happens is that turd in fs/super.c (one that should not have been resurrected) triggers call of block_uevent(). Which uses ->driverfs_dev. Which is a bug. So IMO we should simply kill that animal _again_, and see if block_uevent() is actually need for anything on its own. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 22:48 ` Al Viro @ 2006-05-12 22:51 ` Al Viro 2006-05-12 23:04 ` Linus Torvalds 0 siblings, 1 reply; 44+ messages in thread From: Al Viro @ 2006-05-12 22:51 UTC (permalink / raw) To: Linus Torvalds Cc: Erik Mouw, Or Gerlitz, linux-scsi, axboe, Linux Kernel Mailing List On Fri, May 12, 2006 at 11:48:04PM +0100, Al Viro wrote: > On Fri, May 12, 2006 at 11:28:16PM +0100, Al Viro wrote: > > Ah, right - there was another uevent mess in fs/super.c. Sorry, got them > > confused... and that FPOS _is_ back. > > Actually... > > What happens is that turd in fs/super.c (one that should not have been > resurrected) triggers call of block_uevent(). Which uses ->driverfs_dev. > Which is a bug. > > So IMO we should simply kill that animal _again_, and see if block_uevent() > is actually need for anything on its own. PS: it _is_ OK to trigger than puppy from add_disk()/del_gendisk() and in between. I'm not sure if anyone needs it for anything, though. Triggering it from bdev_uevent() is definitely bogus. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 22:51 ` Al Viro @ 2006-05-12 23:04 ` Linus Torvalds 2006-05-12 23:21 ` Al Viro 0 siblings, 1 reply; 44+ messages in thread From: Linus Torvalds @ 2006-05-12 23:04 UTC (permalink / raw) To: Al Viro; +Cc: Erik Mouw, Or Gerlitz, linux-scsi, axboe, Linux Kernel Mailing List On Fri, 12 May 2006, Al Viro wrote: > > PS: it _is_ OK to trigger than puppy from add_disk()/del_gendisk() and in > between. I'm not sure if anyone needs it for anything, though. Triggering > it from bdev_uevent() is definitely bogus. Wouldn't it be perfectly ok if we just made sure to keep the disk device refcount elevated by the _mount_? Ie Russell's patch to elevate it around everything didn't work, but elevating the bdev->bd_disk reference count by a mount, and dropping it on umount (after doing the umount event) should be ok. No? Linus ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 23:04 ` Linus Torvalds @ 2006-05-12 23:21 ` Al Viro 2006-05-12 23:37 ` Al Viro 0 siblings, 1 reply; 44+ messages in thread From: Al Viro @ 2006-05-12 23:21 UTC (permalink / raw) To: Linus Torvalds Cc: Erik Mouw, Or Gerlitz, linux-scsi, axboe, Linux Kernel Mailing List On Fri, May 12, 2006 at 04:04:53PM -0700, Linus Torvalds wrote: > > > On Fri, 12 May 2006, Al Viro wrote: > > > > PS: it _is_ OK to trigger than puppy from add_disk()/del_gendisk() and in > > between. I'm not sure if anyone needs it for anything, though. Triggering > > it from bdev_uevent() is definitely bogus. > > Wouldn't it be perfectly ok if we just made sure to keep the disk device > refcount elevated by the _mount_? > > Ie Russell's patch to elevate it around everything didn't work, but > elevating the bdev->bd_disk reference count by a mount, and dropping it on > umount (after doing the umount event) should be ok. No? ->bd_disk _is_ pinned down. However, it's not the problem - we keep gendisk for as long as anyone uses it. Which might be after the underlying object is gone; that one is controlled by driver and driver calls del_gendisk() when it decides that we are through. The problem is with disk->driverfs_dev, not disk itself. Block layer has no fscking business touching it after del_gendisk() - if nothing else, we might have _no_ underlying object at all from the very beginning. So anything that wants events related to partitions, let alone mounting, can't expect to see PHYSDEV... crap. Moreover, it can bloody well get to PHYSDEV... itself *if* it wants to and if it's there. There's a reason why we have that symlink in sys/block/<device> and userland can bloody well access it on its own. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 23:21 ` Al Viro @ 2006-05-12 23:37 ` Al Viro 2006-05-12 23:50 ` Linus Torvalds 0 siblings, 1 reply; 44+ messages in thread From: Al Viro @ 2006-05-12 23:37 UTC (permalink / raw) To: Linus Torvalds Cc: Erik Mouw, Or Gerlitz, linux-scsi, axboe, Linux Kernel Mailing List On Sat, May 13, 2006 at 12:21:31AM +0100, Al Viro wrote: > The problem is with disk->driverfs_dev, not disk itself. Block layer > has no fscking business touching it after del_gendisk() - if nothing else, > we might have _no_ underlying object at all from the very beginning. > > So anything that wants events related to partitions, let alone mounting, > can't expect to see PHYSDEV... crap. Moreover, it can bloody well > get to PHYSDEV... itself *if* it wants to and if it's there. There's > a reason why we have that symlink in sys/block/<device> and userland can > bloody well access it on its own. BTW, the best option is to kill bdev_uevent() again. Short of that, skip PHYSDEV mess if disk doesn't have GENHD_FL_UP. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 23:37 ` Al Viro @ 2006-05-12 23:50 ` Linus Torvalds 2006-05-13 0:20 ` Al Viro 2006-05-24 14:07 ` Erik Mouw 0 siblings, 2 replies; 44+ messages in thread From: Linus Torvalds @ 2006-05-12 23:50 UTC (permalink / raw) To: Al Viro; +Cc: Erik Mouw, Or Gerlitz, linux-scsi, axboe, Linux Kernel Mailing List On Sat, 13 May 2006, Al Viro wrote: > > BTW, the best option is to kill bdev_uevent() again. Short of that, > skip PHYSDEV mess if disk doesn't have GENHD_FL_UP. I do think the mount/umount events are valid and interesting, so I'd much rather see the second version. However, that does beg the question: wouldn't that effectively be what the patch I posted would do? Notably the "disk->driverfs_dev = NULL" part after we've dropped it (the "KOBJ_REMOVE" event move is a separate issue, mixed here into the same patch, but should result in possibly better name generation for the event). Basically, onces driverfs_dev has been dropped, we NULL it out, and then the people who use it automatically get the right result. Yes? No? "You're a total klutz, Linus, that patch won't actually do anything, because <xyz>"? Linus --- diff --git a/fs/partitions/check.c b/fs/partitions/check.c index 45ae7dd..7ef1f09 100644 --- a/fs/partitions/check.c +++ b/fs/partitions/check.c @@ -533,6 +533,7 @@ void del_gendisk(struct gendisk *disk) devfs_remove_disk(disk); + kobject_uevent(&disk->kobj, KOBJ_REMOVE); if (disk->holder_dir) kobject_unregister(disk->holder_dir); if (disk->slave_dir) @@ -545,7 +546,7 @@ void del_gendisk(struct gendisk *disk) kfree(disk_name); } put_device(disk->driverfs_dev); + disk->driverfs_dev = NULL; } - kobject_uevent(&disk->kobj, KOBJ_REMOVE); kobject_del(&disk->kobj); } ^ permalink raw reply related [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 23:50 ` Linus Torvalds @ 2006-05-13 0:20 ` Al Viro 2006-05-24 14:07 ` Erik Mouw 1 sibling, 0 replies; 44+ messages in thread From: Al Viro @ 2006-05-13 0:20 UTC (permalink / raw) To: Linus Torvalds Cc: Erik Mouw, Or Gerlitz, linux-scsi, axboe, Linux Kernel Mailing List On Fri, May 12, 2006 at 04:50:32PM -0700, Linus Torvalds wrote: > > > On Sat, 13 May 2006, Al Viro wrote: > > > > BTW, the best option is to kill bdev_uevent() again. Short of that, > > skip PHYSDEV mess if disk doesn't have GENHD_FL_UP. > > I do think the mount/umount events are valid and interesting, so I'd much > rather see the second version. mount/umount events are BS, but that's a separate story. In case you've missed the original discussion: * mount/umount is not tied to block device; if you really want to track mount tree changes, you need much more; if you want to track the set of filesystems being active, you _still_ need more. * "block device is used by fs" is valid, but bdev_uevent() is _NOT_ that; we miss e.g. journals. * "block device is being claimed exclusively" - even more interesting, and again, bdev_uevent() isn't it. * "underlying object of _the_ block device used by a filesystem" is not well-defined, to put it mildly. IOW, it's a big mess. Note that if you care about real mount events ("set of mounts has changed") you can do that already, and get _all_ of them - including mount --move, etc. poll(2) on /proc/mounts will do that for you. And you don't need to be priveleged for that, while we are at it. That's what hald wanted bdev_uevent() for. All attempts to find what other users are really trying to get had not brought anything. I'm not saying that we don't need something similar, but let's try to make sure it makes _sense_ and isn't just "oh, my userland code rereads too often; guess if I would see when kernel reaches this, this and that point I would be able to reduce the frequency of rereads and wouldn't miss too much in the cases I've ran into so far". In the current form bdev_uevent() makes no sense. It pushes a lot of vaguely related stuff and doesn't cover _any_ sane use fully. So yeah, I'm seriously pissed off at the bdev_uevent() merge - at hald crowd for pushing it to gregkh, at gregkh for blindly passing it on and at myself for missing it back then. Note that as soon as hald folks _did_ explain what they really wanted, they've got a sane solution (and are using it now just fine). > However, that does beg the question: wouldn't that effectively be what the > patch I posted would do? Notably the "disk->driverfs_dev = NULL" part > after we've dropped it (the "KOBJ_REMOVE" event move is a separate issue, > mixed here into the same patch, but should result in possibly better name > generation for the event). > > Basically, onces driverfs_dev has been dropped, we NULL it out, and then > the people who use it automatically get the right result. More or less. We don't want to mess with refcounts at all; aside of that, yes, checking if it's NULL is OK. subsystem mutex should be enough, AFAICS. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 23:50 ` Linus Torvalds 2006-05-13 0:20 ` Al Viro @ 2006-05-24 14:07 ` Erik Mouw 1 sibling, 0 replies; 44+ messages in thread From: Erik Mouw @ 2006-05-24 14:07 UTC (permalink / raw) To: Linus Torvalds Cc: Al Viro, Or Gerlitz, linux-scsi, axboe, Linux Kernel Mailing List On Fri, May 12, 2006 at 04:50:32PM -0700, Linus Torvalds wrote: > On Sat, 13 May 2006, Al Viro wrote: > > > > BTW, the best option is to kill bdev_uevent() again. Short of that, > > skip PHYSDEV mess if disk doesn't have GENHD_FL_UP. > > I do think the mount/umount events are valid and interesting, so I'd much > rather see the second version. > > However, that does beg the question: wouldn't that effectively be what the > patch I posted would do? Notably the "disk->driverfs_dev = NULL" part > after we've dropped it (the "KOBJ_REMOVE" event move is a separate issue, > mixed here into the same patch, but should result in possibly better name > generation for the event). > > Basically, onces driverfs_dev has been dropped, we NULL it out, and then > the people who use it automatically get the right result. > > Yes? No? "You're a total klutz, Linus, that patch won't actually do > anything, because <xyz>"? Just want to confirm that I can't recreate the SCSI slab error anymore with your patch (032ebf2620ef99a4fedaa0f77dc2272095ac5863) in the current -git kernel. Erik -- +-- Erik Mouw -- www.harddisk-recovery.nl -- +31 70 370 12 90 -- | Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 22:28 ` Al Viro 2006-05-12 22:48 ` Al Viro @ 2006-05-13 0:08 ` Greg KH 1 sibling, 0 replies; 44+ messages in thread From: Greg KH @ 2006-05-13 0:08 UTC (permalink / raw) To: Al Viro, Andrew Morton Cc: Linus Torvalds, Erik Mouw, Or Gerlitz, linux-scsi, axboe, Linux Kernel Mailing List On Fri, May 12, 2006 at 11:28:16PM +0100, Al Viro wrote: > On Fri, May 12, 2006 at 03:22:42PM -0700, Linus Torvalds wrote: > > > > > > On Fri, 12 May 2006, Al Viro wrote: > > > > > > Secondary question: who had resurrected that crap? I distinctly remember > > > killing it off... > > > > If you did, I don't think it ever got into the kernel. > > > > It was added by Kay Sievers on Nov 3, 2004, according to the old history > > (back then it was in drivers/block/genhd.c, and the function was called > > "block_hotplug()", but apart from renaming the function and moving the > > file, it's recognizably the same. > > > > Of course, you may have killed off an even earlier incarnation.. > > Ah, right - there was another uevent mess in fs/super.c. Sorry, got them > confused... and that FPOS _is_ back. > > gregkh, may I ask why had bdev_uevent() been resurrected? One user, running an old version of HAL, on a distro that had an updated version available (Gentoo), refused to upgrade it and was upset that when they plugged a usb-storage device in, it did not show up as an icon on the desktop. Andrew pointed out that we broke the "don't break the userspace API" kernel rule, so I put the patch back, and noted when it would finally be removed. I would gladly remove it again if everyone can agree that they will not get upset at me again when someone running obsolete/broken userspace programs complains... thanks, greg k-h ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 22:22 ` Linus Torvalds 2006-05-12 22:28 ` Al Viro @ 2006-05-12 22:37 ` Russell King 1 sibling, 0 replies; 44+ messages in thread From: Russell King @ 2006-05-12 22:37 UTC (permalink / raw) To: Linus Torvalds Cc: Al Viro, Erik Mouw, Or Gerlitz, linux-scsi, axboe, Linux Kernel Mailing List On Fri, May 12, 2006 at 03:22:42PM -0700, Linus Torvalds wrote: > On Fri, 12 May 2006, Al Viro wrote: > > Secondary question: who had resurrected that crap? I distinctly remember > > killing it off... > > If you did, I don't think it ever got into the kernel. > > It was added by Kay Sievers on Nov 3, 2004, according to the old history > (back then it was in drivers/block/genhd.c, and the function was called > "block_hotplug()", but apart from renaming the function and moving the > file, it's recognizably the same. > > Of course, you may have killed off an even earlier incarnation.. The changes in question are: commit fa675765afed59bb89adba3369094ebd428b930b tree 777a8c1bb48ef7de39073104f974209f4a462b6f parent b00dc3ad74fdb676552d46ee573b88e927240d0c author Greg Kroah-Hartman <gregkh@suse.de> Wed, 22 Feb 2006 09:39:02 -0800 committer Greg Kroah-Hartman <gregkh@suse.de> Wed, 22 Feb 2006 09:39:02 -0800 Revert mount/umount uevent removal This change reverts the 033b96fd30db52a710d97b06f87d16fc59fee0f1 commit from Kay Sievers that removed the mount/umount uevents from the kernel. Some older versions of HAL still depend on these events to detect when a new device has been mounted. These events are not correctly emitted, and are broken by design, and so, should not be relied upon by any future program. Instead, the /proc/mounts file should be polled to properly detect this kind of event. A feature-removal-schedule.txt entry has been added, noting when this interface will be removed from the kernel. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> commit 033b96fd30db52a710d97b06f87d16fc59fee0f1 tree 00fbccf2cf478307e213f298a221e330f3ba12ae parent 0f76e5acf9dc788e664056dda1e461f0bec93948 author Kay Sievers <kay.sievers@suse.de> 1131685795 +0100 committer Greg Kroah-Hartman <gregkh@suse.de> 1136420287 -0800 [PATCH] remove mount/umount uevents from superblock handling The names of these events have been confusing from the beginning on, as they have been more like claim/release events. We needed these events for noticing HAL if storage devices have been mounted. Thanks to Al, we have the proper solution now and can poll() /proc/mounts instead to get notfied about mount tree changes. Signed-off-by: Kay Sievers <kay.sievers@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: 2.6 Serial core ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache 2006-05-12 21:43 ` Al Viro 2006-05-12 21:55 ` Russell King @ 2006-05-12 21:58 ` Al Viro 1 sibling, 0 replies; 44+ messages in thread From: Al Viro @ 2006-05-12 21:58 UTC (permalink / raw) To: Linus Torvalds, Erik Mouw, Or Gerlitz, linux-scsi, axboe, Linux Kernel Mailing List On Fri, May 12, 2006 at 10:43:54PM +0100, Al Viro wrote: > On Fri, May 12, 2006 at 09:34:16PM +0100, Russell King wrote: > > On Fri, May 12, 2006 at 10:36:57AM -0700, Linus Torvalds wrote: > > > Yes. We could just revert that commit, but it seems correct, and I'd > > > really like for somebody to understand _why_ that commit matters at all. I > > > certainly don't see the overlap here.. > > > > Reverting the commit breaks MMC/SD in a very real way, and the fix > > is plainly correct and is actually the only possible fix that can be > > applied. > > Bullshit. Could you explain what generic code dereferences ->driverfs_dev > after del_gendisk()? If you see such beast, please tell; _that_ is the > real bug. Aha... So block_uevent() appears to be badly broken. Lovely... OK, could somebody explain WTF is userland supposed to do with event refering to device that had been gone for a long time? ^ permalink raw reply [flat|nested] 44+ messages in thread
end of thread, other threads:[~2006-05-24 14:07 UTC | newest]
Thread overview: 44+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20060511151456.GD3755@harddisk-recovery.com>
[not found] ` <15ddcffd0605112153q57f139a1k7068e204a3eeaf1f@mail.gmail.com>
2006-05-12 17:16 ` [BUG 2.6.17-git] kmem_cache_create: duplicate cache scsi_cmd_cache Erik Mouw
2006-05-12 17:27 ` Andrew Vasquez
2006-05-12 20:18 ` Erik Mouw
2006-05-12 17:36 ` Linus Torvalds
2006-05-12 17:47 ` James Bottomley
2006-05-12 18:58 ` James Bottomley
2006-05-12 19:09 ` Linus Torvalds
2006-05-12 20:38 ` Russell King
2006-05-12 20:50 ` Linus Torvalds
2006-05-12 20:58 ` Russell King
2006-05-12 21:27 ` Linus Torvalds
2006-05-12 21:46 ` Linus Torvalds
2006-05-12 21:51 ` Russell King
2006-05-12 22:15 ` Linus Torvalds
2006-05-12 22:37 ` Linus Torvalds
2006-05-12 23:57 ` Greg KH
2006-05-14 16:01 ` Russell King
2006-05-12 21:18 ` Greg KH
2006-05-12 21:32 ` Linus Torvalds
2006-05-12 22:45 ` Greg KH
2006-05-12 20:52 ` Greg KH
2006-05-12 21:03 ` Russell King
2006-05-12 21:34 ` Greg KH
2006-05-12 21:39 ` Al Viro
2006-05-12 20:30 ` Erik Mouw
2006-05-12 20:37 ` Russell King
2006-05-12 20:21 ` Erik Mouw
2006-05-12 20:34 ` Russell King
2006-05-12 21:43 ` Al Viro
2006-05-12 21:55 ` Russell King
2006-05-12 22:08 ` Al Viro
2006-05-12 22:22 ` Linus Torvalds
2006-05-12 22:28 ` Al Viro
2006-05-12 22:48 ` Al Viro
2006-05-12 22:51 ` Al Viro
2006-05-12 23:04 ` Linus Torvalds
2006-05-12 23:21 ` Al Viro
2006-05-12 23:37 ` Al Viro
2006-05-12 23:50 ` Linus Torvalds
2006-05-13 0:20 ` Al Viro
2006-05-24 14:07 ` Erik Mouw
2006-05-13 0:08 ` Greg KH
2006-05-12 22:37 ` Russell King
2006-05-12 21:58 ` Al Viro
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox