linux-ide.vger.kernel.org archive mirror
* Bad module reference counter
@ 2009-02-11  9:32 Stanislaw Gruszka
  2009-02-18 21:25 ` Bartlomiej Zolnierkiewicz
  0 siblings, 1 reply; 7+ messages in thread
From: Stanislaw Gruszka @ 2009-02-11  9:32 UTC
  To: linux-ide

Hello.

I ran into a problem where a module reference counter is decremented twice
and becomes "negative". Here is the usage scenario:

# modprobe at91_ide
# modprobe ide_gd_mod
# lsmod
Module                  Size  Used by    Not tainted
ide_gd_mod             22948  0
at91_ide                4672  0
ide_core               77020  2 ide_gd_mod,at91_ide
# rmmod ide_gd_mod
# lsmod
Module                  Size  Used by    Not tainted
at91_ide                4672 4294967295
ide_core               77020  1 at91_ide

Note that when I remove the at91_ide module first and then ide_gd_mod,
everything is OK.
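
The 4294967295 in the lsmod output is the value a 32-bit unsigned counter shows after being decremented once from zero, which fits one put too many on at91_ide. A minimal userspace sketch of that wraparound follows; fake_refcount_after_puts() is a hypothetical helper for illustration, not kernel code, and the 32-bit width is an assumption.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code: start from a 32-bit unsigned
 * use count and apply `puts` decrements with no underflow check.
 */
uint32_t fake_refcount_after_puts(uint32_t start, unsigned int puts)
{
	uint32_t count = start;

	while (puts--)
		count--;	/* unsigned arithmetic wraps modulo 2^32 */
	return count;
}
```

Calling it with a start value of 0 and one put gives the same number that lsmod printed.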

I tried to debug the issue and did not find anything suspicious in at91_ide.
I think the probable cause is a double free in ide-gd.c. Here is a patch with
a workaround (or maybe it is a real fix, but I'm not sure):

diff --git a/drivers/ide/ide-gd.c b/drivers/ide/ide-gd.c
index 7857b20..31ae04e 100644
--- a/drivers/ide/ide-gd.c
+++ b/drivers/ide/ide-gd.c
@@ -70,8 +70,6 @@ static void ide_gd_remove(ide_drive_t *drive)
 	del_gendisk(g);
 
 	drive->disk_ops->flush(drive);
-
-	ide_disk_put(idkp);
 }
 
 static void ide_disk_release(struct kref *kref)

If this patch is OK, similar changes may also be needed in ide-cd and
perhaps other device type modules.

Cheers
Stanislaw Gruszka

* Re: Bad module reference counter
  2009-02-11  9:32 Bad module reference counter Stanislaw Gruszka
@ 2009-02-18 21:25 ` Bartlomiej Zolnierkiewicz
  2009-02-19 12:48   ` Stanislaw Gruszka
  0 siblings, 1 reply; 7+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2009-02-18 21:25 UTC
  To: Stanislaw Gruszka; +Cc: linux-ide

On Wednesday 11 February 2009, Stanislaw Gruszka wrote:
> Hello.
> 
> I ran into a problem where a module reference counter is decremented twice
> and becomes "negative". Here is the usage scenario:
> 
> # modprobe at91_ide
> # modprobe ide_gd_mod
> # lsmod
> Module                  Size  Used by    Not tainted
> ide_gd_mod             22948  0
> at91_ide                4672  0
> ide_core               77020  2 ide_gd_mod,at91_ide
> # rmmod ide_gd_mod
> # lsmod
> Module                  Size  Used by    Not tainted
> at91_ide                4672 4294967295
> ide_core               77020  1 at91_ide
> 
> Note that when I remove the at91_ide module first and then ide_gd_mod,
> everything is OK.
> 
> I tried to debug the issue and did not find anything suspicious in at91_ide.
> I think the probable cause is a double free in ide-gd.c. Here is a patch with
> a workaround (or maybe it is a real fix, but I'm not sure):
> 
> diff --git a/drivers/ide/ide-gd.c b/drivers/ide/ide-gd.c
> index 7857b20..31ae04e 100644
> --- a/drivers/ide/ide-gd.c
> +++ b/drivers/ide/ide-gd.c
> @@ -70,8 +70,6 @@ static void ide_gd_remove(ide_drive_t *drive)
>  	del_gendisk(g);
>  
>  	drive->disk_ops->flush(drive);
> -
> -	ide_disk_put(idkp);
>  }
>  
>  static void ide_disk_release(struct kref *kref)
> 
> If this patch is OK, similar changes may also be needed in ide-cd and
> perhaps other device type modules.

Seems like ide_device_put() needs the same module_refcount() check that
is present in scsi_device_put() so removal of device driver won't trigger
a spurious module_put() on a host driver?
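
The guard being suggested can be sketched in userspace as follows. This is a simplified model under stated assumptions, not the actual ide_device_put() or scsi_device_put() code: struct fake_module and all fake_* functions are hypothetical stand-ins, and only the use count is modeled.

```c
#include <stdbool.h>

/* Stand-in for struct module; only the use count is modeled. */
struct fake_module {
	unsigned int refcount;
};

/* Take one reference on the (fake) host driver module. */
bool fake_module_get(struct fake_module *m)
{
	m->refcount++;
	return true;
}

/* Drop one reference, with no underflow protection. */
void fake_module_put(struct fake_module *m)
{
	m->refcount--;
}

/* Unguarded variant: always drops a host reference. */
void fake_device_put_unguarded(struct fake_module *host)
{
	fake_module_put(host);
}

/* Guarded variant: skips the put when the count is already zero. */
void fake_device_put_guarded(struct fake_module *host)
{
	if (host->refcount != 0)
		fake_module_put(host);
}
```

The guard only prevents the counter from wrapping. It does not by itself say whether the extra put was legitimate.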

Thanks,
Bart

* Re: Bad module reference counter
  2009-02-18 21:25 ` Bartlomiej Zolnierkiewicz
@ 2009-02-19 12:48   ` Stanislaw Gruszka
  2009-02-19 16:49     ` Bartlomiej Zolnierkiewicz
  0 siblings, 1 reply; 7+ messages in thread
From: Stanislaw Gruszka @ 2009-02-19 12:48 UTC
  To: Bartlomiej Zolnierkiewicz; +Cc: linux-ide, linux-scsi

Wednesday 18 February 2009 22:25:19 Bartlomiej Zolnierkiewicz wrote:
> Seems like ide_device_put() needs the same module_refcount() check that
> is present in scsi_device_put() so removal of device driver won't trigger
> a spurious module_put() on a host driver?

I am a little surprised by the SCSI code (CCing the linux-scsi ML). Is the
comment inside the scsi_device_put() function correct? Why does
scsi_device_get() not check the try_module_get() return value? And most
important: there is a reference counter check before the put, so can the
counter be 0 while the data it protects is still in use?

Adding a module_refcount() != 0 check to ide_device_put() helps only
partially; the command sequence below gives an oops [1].

# modprobe at91_ide
# modprobe ide_gd_mod
# rmmod ide_gd_mod
# modprobe ide_gd_mod
# rmmod at91_ide

The oops happens because the previous "rmmod ide_gd_mod" decreases some
reference counter in ide_device_put(), and during "rmmod at91_ide"
del_gendisk() causes a call to drive_release_dev(), which frees drive->id
before ide_disk_flush() runs. That function then oopses on the NULL drive->id.

There is no oops with my workaround, where I just remove ide_disk_put() from
ide_gd_remove(). It is strange that the _put/_get calls are not symmetrical:
ide_gd_probe() has no call to ide_disk_get().
  
Cheers
Stanislaw Gruszka

[1]:

[ 5043.790000] Unable to handle kernel NULL pointer dereference at virtual address 000000a6
[ 5043.800000] pgd = c3a40000
[ 5043.800000] [000000a6] *pgd=23b55031, *pte=00000000, *ppte=00000000
[ 5043.810000] Internal error: Oops: 17 [#1]
[ 5043.810000] Modules linked in: ide_gd_mod at91_ide(-) ide_core [last unloaded: ide_gd_mod]
[ 5043.810000] CPU: 0    Not tainted  (2.6.29-rc3 #34)
[ 5043.810000] PC is at ide_disk_flush+0x18/0xe4 [ide_gd_mod]
[ 5043.810000] LR is at ide_gd_remove+0x34/0x40 [ide_gd_mod]
[ 5043.810000] pc : [<bf035c6c>]    lr : [<bf035320>]    psr: 80000013
[ 5043.810000] sp : c3bb1d74  ip : c3bb1db0  fp : c3bb1dac
[ 5043.810000] r10: 0000b8e8  r9 : c3bb0000  r8 : c0028f24
[ 5043.810000] r7 : c3b9cde0  r6 : c3a8de00  r5 : c3b57a00  r4 : c3b98000
[ 5043.810000] r3 : bf037ea0  r2 : 00000000  r1 : 00000051  r0 : c3b98000
[ 5043.810000] Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[ 5043.810000] Control: 0005317f  Table: 23a40000  DAC: 00000015
[ 5043.810000] Process rmmod (pid: 11458, stack limit = 0xc3bb0260)
[ 5043.810000] Stack: (0xc3bb1d74 to 0xc3bb2000)
{snip binary stack}
[ 5043.810000] Backtrace:
[ 5043.810000] [<bf035c54>] (ide_disk_flush+0x0/0xe4 [ide_gd_mod]) from [<bf035320>] (ide_gd_remove+0x34/0x40 [ide_gd_mod])
[ 5043.810000]  r4:c3b98000
[ 5043.810000] [<bf0352ec>] (ide_gd_remove+0x0/0x40 [ide_gd_mod]) from [<bf00b198>] (generic_ide_remove+0x24/0x2c [ide_core])
[ 5043.810000]  r6:c3b984f8 r5:bf03a744 r4:c3b98090
[ 5043.810000] [<bf00b174>] (generic_ide_remove+0x0/0x2c [ide_core]) from [<c0170a38>] (__device_release_driver+0x6c/0x88)
[ 5043.810000] [<c01709cc>] (__device_release_driver+0x0/0x88) from [<c0170b08>] (device_release_driver+0x24/0x30)
[ 5043.810000]  r5:c3b98118 r4:c3b98090
[ 5043.810000] [<c0170ae4>] (device_release_driver+0x0/0x30) from [<c016fd68>] (bus_remove_device+0x80/0x94)
[ 5043.810000]  r5:c3b98090 r4:c3b980c0
[ 5043.810000] [<c016fce8>] (bus_remove_device+0x0/0x94) from [<c016e58c>] (device_del+0x104/0x154)
[ 5043.810000]  r5:c3b98404 r4:c3b98090
[ 5043.810000] [<c016e488>] (device_del+0x0/0x154) from [<c016e5f0>] (device_unregister+0x14/0x20)
[ 5043.810000]  r6:00000001 r5:c3b98404 r4:c3b98090
[ 5043.810000] [<c016e5dc>] (device_unregister+0x0/0x20) from [<bf010614>] (__ide_port_unregister_devices+0x30/0x54 [ide_core])
[ 5043.810000]  r4:c3b98000
[ 5043.810000] [<bf0105e4>] (__ide_port_unregister_devices+0x0/0x54 [ide_core]) from [<bf0106c8>] (ide_host_remove+0x70/0x108 [ide_core])
[ 5043.810000]  r6:00000000 r5:c3bb0000 r4:c3b98400
[ 5043.810000] [<bf010658>] (ide_host_remove+0x0/0x108 [ide_core]) from [<bf023634>] (at91_ide_remove+0x14/0x1c [at91_ide])
[ 5043.810000]  r7:00000880 r6:bf024054 r5:bf024054 r4:c02f3500
[ 5043.810000] [<bf023620>] (at91_ide_remove+0x0/0x1c [at91_ide]) from [<c0171b40>] (platform_drv_remove+0x20/0x24)
[ 5043.810000] [<c0171b20>] (platform_drv_remove+0x0/0x24) from [<c0170a38>] (__device_release_driver+0x6c/0x88)
[ 5043.810000] [<c01709cc>] (__device_release_driver+0x0/0x88) from [<c0170abc>] (driver_detach+0x68/0x90)
[ 5043.810000]  r5:c02f3588 r4:c02f3500
[ 5043.810000] [<c0170a54>] (driver_detach+0x0/0x90) from [<c016fc40>] (bus_remove_driver+0x8c/0xb4)
[ 5043.810000]  r6:c0307320 r5:bf0240a0 r4:bf024054
[ 5043.810000] [<c016fbb4>] (bus_remove_driver+0x0/0xb4) from [<c0170f54>] (driver_unregister+0x44/0x48)
[ 5043.810000]  r6:00000000 r5:bf0240a0 r4:bf024054
[ 5043.810000] [<c0170f10>] (driver_unregister+0x0/0x48) from [<c0171ce0>] (platform_driver_unregister+0x14/0x18)
[ 5043.810000]  r6:bf0241c0 r5:bf0240a0 r4:00000000
[ 5043.810000] [<c0171ccc>] (platform_driver_unregister+0x0/0x18) from [<bf023618>] (at91_ide_exit+0x14/0x1c [at91_ide])
[ 5043.810000] [<bf023604>] (at91_ide_exit+0x0/0x1c [at91_ide]) from [<c005b850>] (sys_delete_module+0x1b8/0x230)
[ 5043.810000] [<c005b698>] (sys_delete_module+0x0/0x230) from [<c0028d80>] (ret_fast_syscall+0x0/0x2c)
[ 5043.810000]  r7:00000081 r6:becd1bcc r5:00000880 r4:becd1cd8
[ 5043.810000] Code: e24cb004 e24dd028 e590201c e1a04000 (e1d21ab6)
[ 5044.320000] ---[ end trace 120de1a999313176 ]---
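
The register dump fits the explanation above: r2 is 0 and the fault address is 0xa6, so the failing access is at byte offset 0xa6 from a NULL base. Assuming drive->id points at an array of 16-bit identify words (an assumption about the data layout, not something the trace shows directly), that offset corresponds to word 83. A minimal sketch of the arithmetic, with fake_id_word_index() as a hypothetical helper:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: given a faulting address and the base pointer it
 * was computed from, return the index of the 16-bit word being accessed.
 */
size_t fake_id_word_index(uintptr_t fault_addr, uintptr_t base)
{
	return (size_t)(fault_addr - base) / sizeof(uint16_t);
}
```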

* Re: Bad module reference counter
  2009-02-19 12:48   ` Stanislaw Gruszka
@ 2009-02-19 16:49     ` Bartlomiej Zolnierkiewicz
  2009-02-20 10:45       ` Stanislaw Gruszka
  0 siblings, 1 reply; 7+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2009-02-19 16:49 UTC
  To: Stanislaw Gruszka; +Cc: linux-ide, linux-scsi

On Thursday 19 February 2009, Stanislaw Gruszka wrote:
> Wednesday 18 February 2009 22:25:19 Bartlomiej Zolnierkiewicz wrote:
> > Seems like ide_device_put() needs the same module_refcount() check that
> > is present in scsi_device_put() so removal of device driver won't trigger
> > a spurious module_put() on a host driver?
> 
> I am a little surprised by the SCSI code (CCing the linux-scsi ML). Is the
> comment inside the scsi_device_put() function correct? Why does
> scsi_device_get() not check the try_module_get() return value? And most
> important: there is a reference counter check before the put, so can the
> counter be 0 while the data it protects is still in use?
> 
> Adding a module_refcount() != 0 check to ide_device_put() helps only
> partially; the command sequence below gives an oops [1].
> 
> # modprobe at91_ide
> # modprobe ide_gd_mod
> # rmmod ide_gd_mod
> # modprobe ide_gd_mod
> # rmmod at91_ide
> 
> The oops happens because the previous "rmmod ide_gd_mod" decreases some
> reference counter in ide_device_put(), and during "rmmod at91_ide"
> del_gendisk() causes a call to drive_release_dev(), which frees drive->id
> before ide_disk_flush() runs. That function then oopses on the NULL drive->id.

Uh... we will need some more intrusive changes to the reference counting
to fix it, such as replacing idkp->kref with idkp->dev and making drive->gendev
its parent (so that ->gendev can go away only after the final put on ->dev).

[ IOW we need to have some changes similar to those done in sd.c by:
	commit 6bdaa1f17dd32ec62345c7b57842f53e6278a2fa
  and later by:
	commit ee959b00c335d7780136c5abda37809191fe52c3 ]
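
The lifetime rule described here, where a child object pins its parent until the child's final put, can be modeled in userspace as below. This is a deliberately simplified sketch: struct fake_dev and the fake_dev_* functions are hypothetical, and the model does not reproduce where the real driver core takes or drops the parent reference.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted device with an optional parent. */
struct fake_dev {
	int refs;
	bool released;
	struct fake_dev *parent;
};

/* Start with one reference; a child also takes a reference on its parent. */
void fake_dev_init(struct fake_dev *dev, struct fake_dev *parent)
{
	dev->refs = 1;
	dev->released = false;
	dev->parent = parent;
	if (parent)
		parent->refs++;
}

/* Drop one reference; the final put releases the device, then its parent ref. */
void fake_dev_put(struct fake_dev *dev)
{
	if (--dev->refs > 0)
		return;
	dev->released = true;
	if (dev->parent)
		fake_dev_put(dev->parent);
}
```

In this model, dropping the host's own reference on the parent first leaves the parent alive until the child is put as well.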

> There is no oops with my workaround, when I just remove ide_disk_put() from

I suppose that after ide_disk_put() removal ide_disk_release() is simply
never called... ;)

> ide_gd_remove(). It is strange that the _put/_get calls are not symmetrical:
> ide_gd_probe() has no call to ide_disk_get().

We have kref_init() in ide_disk_probe(), so there is no need for it
and we also don't want to hold an extra reference on host driver...
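
The balance can be checked with a small userspace model. It assumes only that the counter starts at 1 after initialization, as kref_init() does; struct fake_kref and the fake_* functions are hypothetical stand-ins, not the kernel kref API.

```c
/* Hypothetical stand-in for struct kref. */
struct fake_kref {
	int count;
};

/* Like kref_init(): the creator implicitly owns the first reference. */
void fake_kref_init(struct fake_kref *k)
{
	k->count = 1;
}

void fake_kref_get(struct fake_kref *k)
{
	k->count++;
}

/* Returns 1 when this put dropped the last reference. */
int fake_kref_put(struct fake_kref *k)
{
	return --k->count == 0;
}

/* Replay probe, `opens` open/close pairs, then remove; return final count. */
int fake_lifecycle(unsigned int opens)
{
	struct fake_kref k;
	unsigned int i;

	fake_kref_init(&k);		/* probe: count starts at 1 */
	for (i = 0; i < opens; i++)
		fake_kref_get(&k);	/* open */
	for (i = 0; i < opens; i++)
		fake_kref_put(&k);	/* close */
	fake_kref_put(&k);		/* remove drops the initial reference */
	return k.count;
}
```

The count reaches zero for any number of open/close pairs, so the put in remove pairs with the initialization in probe rather than with an explicit get.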

Thanks,
Bart
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: Bad module reference counter
  2009-02-19 16:49     ` Bartlomiej Zolnierkiewicz
@ 2009-02-20 10:45       ` Stanislaw Gruszka
  2009-02-23 22:36         ` Bartlomiej Zolnierkiewicz
  0 siblings, 1 reply; 7+ messages in thread
From: Stanislaw Gruszka @ 2009-02-20 10:45 UTC
  To: Bartlomiej Zolnierkiewicz; +Cc: linux-ide, linux-scsi

Thursday 19 February 2009 17:49:52 Bartlomiej Zolnierkiewicz wrote:
> > > Seems like ide_device_put() needs the same module_refcount() check that
> > > is present in scsi_device_put() so removal of device driver won't trigger
> > > a spurious module_put() on a host driver?
> > 
> > I am a little surprised by the SCSI code (CCing the linux-scsi ML). Is the
> > comment inside the scsi_device_put() function correct? Why does
> > scsi_device_get() not check the try_module_get() return value? And most
> > important: there is a reference counter check before the put, so can the
> > counter be 0 while the data it protects is still in use?

Any comments?

> Uh... we will need some more intrusive changes to the reference counting
> to fix it, such as replacing idkp->kref with idkp->dev and making drive->gendev
> its parent (so that ->gendev can go away only after the final put on ->dev).
> 
> [ IOW we need to have some changes similar to those done in sd.c by:
> 	commit 6bdaa1f17dd32ec62345c7b57842f53e6278a2fa
>   and later by:
> 	commit ee959b00c335d7780136c5abda37809191fe52c3 ]
> 
> > There is no oops with my workaround, when I just remove ide_disk_put() from
>
> I suppose that after ide_disk_put() removal ide_disk_release() is simply
> never called... ;)
> 
> > ide_gd_remove(). It is strange that the _put/_get calls are not symmetrical:
> > ide_gd_probe() has no call to ide_disk_get().
> 
> We have kref_init() in ide_disk_probe(), so there is no need for it
> and we also don't want to hold an extra reference on host driver...

It looks like using ->dev instead of ->kref will do the job. But perhaps a
less intrusive fix, such as checking the kref_put() result in ide_disk_put(),
would be a better solution. I tested the patch below and everything is fine.

diff --git a/drivers/ide/ide-gd.c b/drivers/ide/ide-gd.c
index 7857b20..598f21b 100644
--- a/drivers/ide/ide-gd.c
+++ b/drivers/ide/ide-gd.c
@@ -48,8 +48,8 @@ static void ide_disk_put(struct ide_disk_obj *idkp)
 	ide_drive_t *drive = idkp->drive;
 
 	mutex_lock(&ide_disk_ref_mutex);
-	kref_put(&idkp->kref, ide_disk_release);
-	ide_device_put(drive);
+	if (!kref_put(&idkp->kref, ide_disk_release))
+		ide_device_put(drive);
 	mutex_unlock(&ide_disk_ref_mutex);
 }
 
If this patch is OK and replacing kref with dev is not currently planned, maybe
I'll send an "official" patch with the ide-gd fix and fixes for the other
device types.
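
The effect of making the host put conditional on the kref_put() result can be traced with a userspace model. All names here are hypothetical stand-ins (struct fake_disk, fake_disk_get(), fake_disk_put()), and the probe/open/close/remove sequence is an assumed scenario for illustration.

```c
#include <stdbool.h>

/* Hypothetical model of the two counters involved. */
struct fake_disk {
	int kref;	/* stands in for idkp->kref */
	int host_refs;	/* stands in for the host module use count */
};

/* Returns 1 when the last object reference was dropped. */
static int fake_kref_put(struct fake_disk *d)
{
	return --d->kref == 0;
}

/* Model of the get path: pin the host, then the object. */
static void fake_disk_get(struct fake_disk *d)
{
	d->host_refs++;
	d->kref++;
}

/* Original behaviour always drops a host ref; patched skips it on release. */
static void fake_disk_put(struct fake_disk *d, bool patched)
{
	int released = fake_kref_put(d);

	if (!patched || !released)
		d->host_refs--;
}

/* Replay probe, one open/close pair and remove; return the host count. */
int fake_host_refs_after_remove(bool patched)
{
	struct fake_disk d = { .kref = 1, .host_refs = 0 };	/* probe */

	fake_disk_get(&d);		/* open */
	fake_disk_put(&d, patched);	/* close */
	fake_disk_put(&d, patched);	/* remove */
	return d.host_refs;
}
```

In this model the unpatched sequence ends with a host count of -1, while the patched one ends at 0. The model says nothing about the lifetime of the drive itself.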

Regards
Stanislaw Gruszka

* Re: Bad module reference counter
  2009-02-20 10:45       ` Stanislaw Gruszka
@ 2009-02-23 22:36         ` Bartlomiej Zolnierkiewicz
  2009-02-25 11:00           ` Stanislaw Gruszka
  0 siblings, 1 reply; 7+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2009-02-23 22:36 UTC
  To: Stanislaw Gruszka; +Cc: linux-ide, linux-scsi

On Friday 20 February 2009, Stanislaw Gruszka wrote:
> Thursday 19 February 2009 17:49:52 Bartlomiej Zolnierkiewicz wrote:
> It looks like using ->dev instead of ->kref will do the job. But perhaps a
> less intrusive fix, such as checking the kref_put() result in ide_disk_put(),
> would be a better solution. I tested the patch below and everything is fine.
> 
> diff --git a/drivers/ide/ide-gd.c b/drivers/ide/ide-gd.c
> index 7857b20..598f21b 100644
> --- a/drivers/ide/ide-gd.c
> +++ b/drivers/ide/ide-gd.c
> @@ -48,8 +48,8 @@ static void ide_disk_put(struct ide_disk_obj *idkp)
>  	ide_drive_t *drive = idkp->drive;
>  
>  	mutex_lock(&ide_disk_ref_mutex);
> -	kref_put(&idkp->kref, ide_disk_release);
> -	ide_device_put(drive);
> +	if (!kref_put(&idkp->kref, ide_disk_release))
> +		ide_device_put(drive);

I worry that this just masks the problem: according to your previous mail,
the drive can still be gone before ->flush in ide_gd_remove().

>  	mutex_unlock(&ide_disk_ref_mutex);
>  }
>  
> If this patch is OK and replacing kref with dev is not currently planned, maybe
> I'll send an "official" patch with the ide-gd fix and fixes for the other
> device types.

Let's fix it fully.  The patch below, together with the previous
ide_device_put() fix and the drive_release_dev() one (from another mail),
should make all the problems go away...

From: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Subject: [PATCH] ide: fix refcounting in device drivers

During host driver module removal del_gendisk() results in a final
put on drive->gendev and freeing the drive by drive_release_dev().

Convert device drivers from using struct kref to use struct device
so device driver's object holds reference on ->gendev and prevents
drive from prematurely going away.

Also fix ->remove methods to not erroneously drop reference on a
host driver by using only put_device() instead of ide*_put().

Reported-by: Stanislaw Gruszka <stf_xl@wp.pl>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
---
 drivers/ide/ide-cd.c   |   27 ++++++++++++++++++---------
 drivers/ide/ide-cd.h   |    2 +-
 drivers/ide/ide-gd.c   |   26 +++++++++++++++++---------
 drivers/ide/ide-gd.h   |    2 +-
 drivers/ide/ide-tape.c |   29 +++++++++++++++++++----------
 include/linux/ide.h    |    2 +-
 6 files changed, 57 insertions(+), 31 deletions(-)

Index: b/drivers/ide/ide-cd.c
===================================================================
--- a/drivers/ide/ide-cd.c
+++ b/drivers/ide/ide-cd.c
@@ -55,7 +55,7 @@
 
 static DEFINE_MUTEX(idecd_ref_mutex);
 
-static void ide_cd_release(struct kref *);
+static void ide_cd_release(struct device *);
 
 static struct cdrom_info *ide_cd_get(struct gendisk *disk)
 {
@@ -67,7 +67,7 @@ static struct cdrom_info *ide_cd_get(str
 		if (ide_device_get(cd->drive))
 			cd = NULL;
 		else
-			kref_get(&cd->kref);
+			get_device(&cd->dev);
 
 	}
 	mutex_unlock(&idecd_ref_mutex);
@@ -79,7 +79,7 @@ static void ide_cd_put(struct cdrom_info
 	ide_drive_t *drive = cd->drive;
 
 	mutex_lock(&idecd_ref_mutex);
-	kref_put(&cd->kref, ide_cd_release);
+	put_device(&cd->dev);
 	ide_device_put(drive);
 	mutex_unlock(&idecd_ref_mutex);
 }
@@ -1790,15 +1790,17 @@ static void ide_cd_remove(ide_drive_t *d
 	ide_debug_log(IDE_DBG_FUNC, "Call %s\n", __func__);
 
 	ide_proc_unregister_driver(drive, info->driver);
-
+	device_del(&info->dev);
 	del_gendisk(info->disk);
 
-	ide_cd_put(info);
+	mutex_lock(&idecd_ref_mutex);
+	put_device(&info->dev);
+	mutex_unlock(&idecd_ref_mutex);
 }
 
-static void ide_cd_release(struct kref *kref)
+static void ide_cd_release(struct device *dev)
 {
-	struct cdrom_info *info = to_ide_drv(kref, cdrom_info);
+	struct cdrom_info *info = to_ide_drv(dev, cdrom_info);
 	struct cdrom_device_info *devinfo = &info->devinfo;
 	ide_drive_t *drive = info->drive;
 	struct gendisk *g = info->disk;
@@ -1997,7 +1999,12 @@ static int ide_cd_probe(ide_drive_t *dri
 
 	ide_init_disk(g, drive);
 
-	kref_init(&info->kref);
+	info->dev.parent = &drive->gendev;
+	info->dev.release = ide_cd_release;
+	dev_set_name(&info->dev, dev_name(&drive->gendev));
+
+	if (device_register(&info->dev))
+		goto out_free_disk;
 
 	info->drive = drive;
 	info->driver = &ide_cdrom_driver;
@@ -2011,7 +2018,7 @@ static int ide_cd_probe(ide_drive_t *dri
 	g->driverfs_dev = &drive->gendev;
 	g->flags = GENHD_FL_CD | GENHD_FL_REMOVABLE;
 	if (ide_cdrom_setup(drive)) {
-		ide_cd_release(&info->kref);
+		put_device(&info->dev);
 		goto failed;
 	}
 
@@ -2021,6 +2028,8 @@ static int ide_cd_probe(ide_drive_t *dri
 	add_disk(g);
 	return 0;
 
+out_free_disk:
+	put_disk(g);
 out_free_cd:
 	kfree(info);
 failed:
Index: b/drivers/ide/ide-cd.h
===================================================================
--- a/drivers/ide/ide-cd.h
+++ b/drivers/ide/ide-cd.h
@@ -80,7 +80,7 @@ struct cdrom_info {
 	ide_drive_t		*drive;
 	struct ide_driver	*driver;
 	struct gendisk		*disk;
-	struct kref		kref;
+	struct device		dev;
 
 	/* Buffer for table of contents.  NULL if we haven't allocated
 	   a TOC buffer for this device yet. */
Index: b/drivers/ide/ide-gd.c
===================================================================
--- a/drivers/ide/ide-gd.c
+++ b/drivers/ide/ide-gd.c
@@ -25,7 +25,7 @@ module_param(debug_mask, ulong, 0644);
 
 static DEFINE_MUTEX(ide_disk_ref_mutex);
 
-static void ide_disk_release(struct kref *);
+static void ide_disk_release(struct device *);
 
 static struct ide_disk_obj *ide_disk_get(struct gendisk *disk)
 {
@@ -37,7 +37,7 @@ static struct ide_disk_obj *ide_disk_get
 		if (ide_device_get(idkp->drive))
 			idkp = NULL;
 		else
-			kref_get(&idkp->kref);
+			get_device(&idkp->dev);
 	}
 	mutex_unlock(&ide_disk_ref_mutex);
 	return idkp;
@@ -48,7 +48,7 @@ static void ide_disk_put(struct ide_disk
 	ide_drive_t *drive = idkp->drive;
 
 	mutex_lock(&ide_disk_ref_mutex);
-	kref_put(&idkp->kref, ide_disk_release);
+	put_device(&idkp->dev);
 	ide_device_put(drive);
 	mutex_unlock(&ide_disk_ref_mutex);
 }
@@ -66,17 +66,18 @@ static void ide_gd_remove(ide_drive_t *d
 	struct gendisk *g = idkp->disk;
 
 	ide_proc_unregister_driver(drive, idkp->driver);
-
+	device_del(&idkp->dev);
 	del_gendisk(g);
-
 	drive->disk_ops->flush(drive);
 
-	ide_disk_put(idkp);
+	mutex_lock(&ide_disk_ref_mutex);
+	put_device(&idkp->dev);
+	mutex_unlock(&ide_disk_ref_mutex);
 }
 
-static void ide_disk_release(struct kref *kref)
+static void ide_disk_release(struct device *dev)
 {
-	struct ide_disk_obj *idkp = to_ide_drv(kref, ide_disk_obj);
+	struct ide_disk_obj *idkp = to_ide_drv(dev, ide_disk_obj);
 	ide_drive_t *drive = idkp->drive;
 	struct gendisk *g = idkp->disk;
 
@@ -348,7 +349,12 @@ static int ide_gd_probe(ide_drive_t *dri
 
 	ide_init_disk(g, drive);
 
-	kref_init(&idkp->kref);
+	idkp->dev.parent = &drive->gendev;
+	idkp->dev.release = ide_disk_release;
+	dev_set_name(&idkp->dev, dev_name(&drive->gendev));
+
+	if (device_register(&idkp->dev))
+		goto out_free_disk;
 
 	idkp->drive = drive;
 	idkp->driver = &ide_gd_driver;
@@ -373,6 +379,8 @@ static int ide_gd_probe(ide_drive_t *dri
 	add_disk(g);
 	return 0;
 
+out_free_disk:
+	put_disk(g);
 out_free_idkp:
 	kfree(idkp);
 failed:
Index: b/drivers/ide/ide-gd.h
===================================================================
--- a/drivers/ide/ide-gd.h
+++ b/drivers/ide/ide-gd.h
@@ -17,7 +17,7 @@ struct ide_disk_obj {
 	ide_drive_t		*drive;
 	struct ide_driver	*driver;
 	struct gendisk		*disk;
-	struct kref		kref;
+	struct device		dev;
 	unsigned int		openers;	/* protected by BKL for now */
 
 	/* Last failed packet command */
Index: b/drivers/ide/ide-tape.c
===================================================================
--- a/drivers/ide/ide-tape.c
+++ b/drivers/ide/ide-tape.c
@@ -169,7 +169,7 @@ typedef struct ide_tape_obj {
 	ide_drive_t		*drive;
 	struct ide_driver	*driver;
 	struct gendisk		*disk;
-	struct kref		kref;
+	struct device		dev;
 
 	/*
 	 *	failed_pc points to the last failed packet command, or contains
@@ -267,7 +267,7 @@ static DEFINE_MUTEX(idetape_ref_mutex);
 
 static struct class *idetape_sysfs_class;
 
-static void ide_tape_release(struct kref *);
+static void ide_tape_release(struct device *);
 
 static struct ide_tape_obj *ide_tape_get(struct gendisk *disk)
 {
@@ -279,7 +279,7 @@ static struct ide_tape_obj *ide_tape_get
 		if (ide_device_get(tape->drive))
 			tape = NULL;
 		else
-			kref_get(&tape->kref);
+			get_device(&tape->dev);
 	}
 	mutex_unlock(&idetape_ref_mutex);
 	return tape;
@@ -290,7 +290,7 @@ static void ide_tape_put(struct ide_tape
 	ide_drive_t *drive = tape->drive;
 
 	mutex_lock(&idetape_ref_mutex);
-	kref_put(&tape->kref, ide_tape_release);
+	put_device(&tape->dev);
 	ide_device_put(drive);
 	mutex_unlock(&idetape_ref_mutex);
 }
@@ -308,7 +308,7 @@ static struct ide_tape_obj *ide_tape_chr
 	mutex_lock(&idetape_ref_mutex);
 	tape = idetape_devs[i];
 	if (tape)
-		kref_get(&tape->kref);
+		get_device(&tape->dev);
 	mutex_unlock(&idetape_ref_mutex);
 	return tape;
 }
@@ -2256,15 +2256,17 @@ static void ide_tape_remove(ide_drive_t 
 	idetape_tape_t *tape = drive->driver_data;
 
 	ide_proc_unregister_driver(drive, tape->driver);
-
+	device_del(&tape->dev);
 	ide_unregister_region(tape->disk);
 
-	ide_tape_put(tape);
+	mutex_lock(&idetape_ref_mutex);
+	put_device(&tape->dev);
+	mutex_unlock(&idetape_ref_mutex);
 }
 
-static void ide_tape_release(struct kref *kref)
+static void ide_tape_release(struct device *dev)
 {
-	struct ide_tape_obj *tape = to_ide_drv(kref, ide_tape_obj);
+	struct ide_tape_obj *tape = to_ide_drv(dev, ide_tape_obj);
 	ide_drive_t *drive = tape->drive;
 	struct gendisk *g = tape->disk;
 
@@ -2407,7 +2409,12 @@ static int ide_tape_probe(ide_drive_t *d
 
 	ide_init_disk(g, drive);
 
-	kref_init(&tape->kref);
+	tape->dev.parent = &drive->gendev;
+	tape->dev.release = ide_tape_release;
+	dev_set_name(&tape->dev, dev_name(&drive->gendev));
+
+	if (device_register(&tape->dev))
+		goto out_free_disk;
 
 	tape->drive = drive;
 	tape->driver = &idetape_driver;
@@ -2436,6 +2443,8 @@ static int ide_tape_probe(ide_drive_t *d
 
 	return 0;
 
+out_free_disk:
+	put_disk(g);
 out_free_tape:
 	kfree(tape);
 failed:
Index: b/include/linux/ide.h
===================================================================
--- a/include/linux/ide.h
+++ b/include/linux/ide.h
@@ -663,7 +663,7 @@ typedef struct ide_drive_s ide_drive_t;
 #define to_ide_device(dev)		container_of(dev, ide_drive_t, gendev)
 
 #define to_ide_drv(obj, cont_type)	\
-	container_of(obj, struct cont_type, kref)
+	container_of(obj, struct cont_type, dev)
 
 #define ide_drv_g(disk, cont_type)	\
 	container_of((disk)->private_data, struct cont_type, driver)
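
The to_ide_drv() change relies on container_of() recovering the enclosing object from a pointer to one of its members, which is how a release callback that receives a struct device * can get back to the driver object. A minimal userspace sketch of that pointer arithmetic follows; fake_container_of, struct fake_device and struct fake_disk_obj are hypothetical stand-ins, and the macro omits the type checking done by the kernel version.

```c
#include <stddef.h>

/* Simplified stand-in for container_of(): subtract the member's offset. */
#define fake_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical embedded member, standing in for struct device. */
struct fake_device {
	int id;
};

/* Hypothetical driver object that embeds the device. */
struct fake_disk_obj {
	int openers;
	struct fake_device dev;
};

/* Map a pointer to the embedded member back to its enclosing object. */
struct fake_disk_obj *fake_to_disk_obj(struct fake_device *dev)
{
	return fake_container_of(dev, struct fake_disk_obj, dev);
}
```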

* Re: Bad module reference counter
  2009-02-23 22:36         ` Bartlomiej Zolnierkiewicz
@ 2009-02-25 11:00           ` Stanislaw Gruszka
  0 siblings, 0 replies; 7+ messages in thread
From: Stanislaw Gruszka @ 2009-02-25 11:00 UTC
  To: Bartlomiej Zolnierkiewicz; +Cc: linux-ide, linux-scsi

Monday 23 February 2009 23:36:35 Bartlomiej Zolnierkiewicz wrote:
> > It looks like using ->dev instead of ->kref will do the job. But perhaps a
> > less intrusive fix, such as checking the kref_put() result in ide_disk_put(),
> > would be a better solution. I tested the patch below and everything is fine.
> > 
> > diff --git a/drivers/ide/ide-gd.c b/drivers/ide/ide-gd.c
> > index 7857b20..598f21b 100644
> > --- a/drivers/ide/ide-gd.c
> > +++ b/drivers/ide/ide-gd.c
> > @@ -48,8 +48,8 @@ static void ide_disk_put(struct ide_disk_obj *idkp)
> >  	ide_drive_t *drive = idkp->drive;
> >  
> >  	mutex_lock(&ide_disk_ref_mutex);
> > -	kref_put(&idkp->kref, ide_disk_release);
> > -	ide_device_put(drive);
> > +	if (!kref_put(&idkp->kref, ide_disk_release))
> > +		ide_device_put(drive);
> 
> I worry that this just masks the problem: according to your previous mail,
> the drive can still be gone before ->flush in ide_gd_remove().

I checked my patch against all mentioned problems, but it doesn't matter now.
 
> >  	mutex_unlock(&ide_disk_ref_mutex);
> >  }
> >  
> > If this patch is OK and replacing kref with dev is not currently planned, maybe
> > I'll send an "official" patch with the ide-gd fix and fixes for the other
> > device types.
> 
> Let's fix it fully.  The patch below, together with the previous
> ide_device_put() fix and the drive_release_dev() one (from another mail),
> should make all the problems go away...
> 
> From: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
> Subject: [PATCH] ide: fix refcounting in device drivers
> 
> During host driver module removal del_gendisk() results in a final
> put on drive->gendev and freeing the drive by drive_release_dev().
> 
> Convert device drivers from using struct kref to use struct device
> so device driver's object holds reference on ->gendev and prevents
> drive from prematurely going away.
> 
> Also fix ->remove methods to not erroneously drop reference on a
> host driver by using only put_device() instead of ide*_put().
> 
> Reported-by: Stanislaw Gruszka <stf_xl@wp.pl>
> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

Tested-by: Stanislaw Gruszka <stf_xl@wp.pl>

end of thread, other threads:[~2009-02-25 11:00 UTC | newest]

Thread overview: 7+ messages
2009-02-11  9:32 Bad module reference counter Stanislaw Gruszka
2009-02-18 21:25 ` Bartlomiej Zolnierkiewicz
2009-02-19 12:48   ` Stanislaw Gruszka
2009-02-19 16:49     ` Bartlomiej Zolnierkiewicz
2009-02-20 10:45       ` Stanislaw Gruszka
2009-02-23 22:36         ` Bartlomiej Zolnierkiewicz
2009-02-25 11:00           ` Stanislaw Gruszka
