linux-mtd.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* Race-free NAND device removal
@ 2016-07-03 13:38 Richard Weinberger
  2016-07-04  9:16 ` Boris Brezillon
  0 siblings, 1 reply; 8+ messages in thread
From: Richard Weinberger @ 2016-07-03 13:38 UTC (permalink / raw)
  To: linux-mtd@lists.infradead.org; +Cc: Boris Brezillon, Brian Norris

Hi!

While working on nandsim I realized that nand_release() ignores the return
value from mtd_device_unregister().

That means NAND devices cannot removed in a race-free manner.
Consider a NAND driver that registers ->_get_device() and ->_put_device()
callbacks for refcounting. In its removal function it will return -EBUSY
whenever the refcount is > 0.
But when device is claimed while removing it, it can happen that the refcount
increments after the check.
MTD can deal with that and mtd_device_unregister() will return EBUSY.
But nand_release() won't notice and the NAND driver continues with the tear down
process.

Would be a change like the following one acceptable or is a NAND driver
allowed to call mtd_device_unregister() itself?
AFAICT the additional call to mtd_device_unregister() in nand_release() would
be an nop then.

diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
index 0b0dc29..dc76bc6 100644
--- a/drivers/mtd/nand/nand_base.c
+++ b/drivers/mtd/nand/nand_base.c
@@ -4604,16 +4604,19 @@ EXPORT_SYMBOL(nand_scan);
  * nand_release - [NAND Interface] Free resources held by the NAND device
  * @mtd: MTD device structure
  */
-void nand_release(struct mtd_info *mtd)
+int nand_release(struct mtd_info *mtd)
 {
+	int ret;
 	struct nand_chip *chip = mtd_to_nand(mtd);

+	ret = mtd_device_unregister(mtd);
+	if (ret)
+		return ret;
+
 	if (chip->ecc.mode == NAND_ECC_SOFT &&
 	    chip->ecc.algo == NAND_ECC_BCH)
 		nand_bch_free((struct nand_bch_control *)chip->ecc.priv);

-	mtd_device_unregister(mtd);
-
 	/* Free bad block table memory */
 	kfree(chip->bbt);
 	if (!(chip->options & NAND_OWN_BUFFERS))
@@ -4623,6 +4626,8 @@ void nand_release(struct mtd_info *mtd)
 	if (chip->badblock_pattern && chip->badblock_pattern->options
 			& NAND_BBT_DYNAMICSTRUCT)
 		kfree(chip->badblock_pattern);
+
+	return 0;
 }
 EXPORT_SYMBOL_GPL(nand_release);

diff --git a/include/linux/mtd/nand.h b/include/linux/mtd/nand.h
index fbe8e16..c15b1c4 100644
--- a/include/linux/mtd/nand.h
+++ b/include/linux/mtd/nand.h
@@ -39,7 +39,7 @@ extern int nand_scan_ident(struct mtd_info *mtd, int max_chips,
 extern int nand_scan_tail(struct mtd_info *mtd);

 /* Free resources held by the NAND device */
-extern void nand_release(struct mtd_info *mtd);
+extern int nand_release(struct mtd_info *mtd);

 /* Internal helper for board drivers which need to override command function */
 extern void nand_wait_ready(struct mtd_info *mtd);

Thanks,
//richard

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: Race-free NAND device removal
  2016-07-03 13:38 Race-free NAND device removal Richard Weinberger
@ 2016-07-04  9:16 ` Boris Brezillon
  2016-07-04  9:44   ` Richard Weinberger
  2016-07-04 21:34   ` Richard Weinberger
  0 siblings, 2 replies; 8+ messages in thread
From: Boris Brezillon @ 2016-07-04  9:16 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd@lists.infradead.org, Brian Norris

On Sun, 3 Jul 2016 15:38:42 +0200
Richard Weinberger <richard@nod.at> wrote:

> Hi!
> 
> While working on nandsim I realized that nand_release() ignores the return
> value from mtd_device_unregister().
> 
> That means NAND devices cannot removed in a race-free manner.
> Consider a NAND driver that registers ->_get_device() and ->_put_device()
> callbacks for refcounting. In its removal function it will return -EBUSY
> whenever the refcount is > 0.
> But when device is claimed while removing it, it can happen that the refcount
> increments after the check.
> MTD can deal with that and mtd_device_unregister() will return EBUSY.
> But nand_release() won't notice and the NAND driver continues with the tear down
> process.

Yes, I already noticed that, and apparently all NAND controller drivers
seem to assume that nand_release() always succeed. It's definitely a
bug, since the MTD device will still be exposed, but the underlying
NAND structure (and the associated data + implementation) will be
gone :-/.

> 
> Would be a change like the following one acceptable or is a NAND driver
> allowed to call mtd_device_unregister() itself?
> AFAICT the additional call to mtd_device_unregister() in nand_release() would
> be an nop then.

This patch looks good, but NAND controller drivers will keep ignoring
the nand_release() return code and release their own private data, so
implementations are still buggy ;).

This whole NAND dev registration/deregistration is unsafe, and I plan
to rework it when moving to a controller <-> chips infrastructure.

Are you fixing a real bug or just a potential one? Cause I'm not sure
doing that is any safer if we don't patch all the NAND controller
drivers...

> 
> diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
> index 0b0dc29..dc76bc6 100644
> --- a/drivers/mtd/nand/nand_base.c
> +++ b/drivers/mtd/nand/nand_base.c
> @@ -4604,16 +4604,19 @@ EXPORT_SYMBOL(nand_scan);
>   * nand_release - [NAND Interface] Free resources held by the NAND device
>   * @mtd: MTD device structure
>   */
> -void nand_release(struct mtd_info *mtd)
> +int nand_release(struct mtd_info *mtd)
>  {
> +	int ret;
>  	struct nand_chip *chip = mtd_to_nand(mtd);
> 
> +	ret = mtd_device_unregister(mtd);
> +	if (ret)
> +		return ret;
> +
>  	if (chip->ecc.mode == NAND_ECC_SOFT &&
>  	    chip->ecc.algo == NAND_ECC_BCH)
>  		nand_bch_free((struct nand_bch_control *)chip->ecc.priv);
> 
> -	mtd_device_unregister(mtd);
> -
>  	/* Free bad block table memory */
>  	kfree(chip->bbt);
>  	if (!(chip->options & NAND_OWN_BUFFERS))
> @@ -4623,6 +4626,8 @@ void nand_release(struct mtd_info *mtd)
>  	if (chip->badblock_pattern && chip->badblock_pattern->options
>  			& NAND_BBT_DYNAMICSTRUCT)
>  		kfree(chip->badblock_pattern);
> +
> +	return 0;
>  }
>  EXPORT_SYMBOL_GPL(nand_release);
> 
> diff --git a/include/linux/mtd/nand.h b/include/linux/mtd/nand.h
> index fbe8e16..c15b1c4 100644
> --- a/include/linux/mtd/nand.h
> +++ b/include/linux/mtd/nand.h
> @@ -39,7 +39,7 @@ extern int nand_scan_ident(struct mtd_info *mtd, int max_chips,
>  extern int nand_scan_tail(struct mtd_info *mtd);
> 
>  /* Free resources held by the NAND device */
> -extern void nand_release(struct mtd_info *mtd);
> +extern int nand_release(struct mtd_info *mtd);
> 
>  /* Internal helper for board drivers which need to override command function */
>  extern void nand_wait_ready(struct mtd_info *mtd);
> 
> Thanks,
> //richard

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Race-free NAND device removal
  2016-07-04  9:16 ` Boris Brezillon
@ 2016-07-04  9:44   ` Richard Weinberger
  2016-07-04 10:06     ` Boris Brezillon
  2016-07-04 21:34   ` Richard Weinberger
  1 sibling, 1 reply; 8+ messages in thread
From: Richard Weinberger @ 2016-07-04  9:44 UTC (permalink / raw)
  To: Boris Brezillon; +Cc: linux-mtd@lists.infradead.org, Brian Norris



Am 04.07.2016 um 11:16 schrieb Boris Brezillon:
> On Sun, 3 Jul 2016 15:38:42 +0200
> Richard Weinberger <richard@nod.at> wrote:
> 
>> Hi!
>>
>> While working on nandsim I realized that nand_release() ignores the return
>> value from mtd_device_unregister().
>>
>> That means NAND devices cannot removed in a race-free manner.
>> Consider a NAND driver that registers ->_get_device() and ->_put_device()
>> callbacks for refcounting. In its removal function it will return -EBUSY
>> whenever the refcount is > 0.
>> But when device is claimed while removing it, it can happen that the refcount
>> increments after the check.
>> MTD can deal with that and mtd_device_unregister() will return EBUSY.
>> But nand_release() won't notice and the NAND driver continues with the tear down
>> process.
> 
> Yes, I already noticed that, and apparently all NAND controller drivers
> seem to assume that nand_release() always succeed. It's definitely a
> bug, since the MTD device will still be exposed, but the underlying
> NAND structure (and the associated data + implementation) will be
> gone :-/.

Well, in most cases it will work since the module refcounting kicks in.
And no NAND drivers create/remove MTDs during runtime.

>>
>> Would be a change like the following one acceptable or is a NAND driver
>> allowed to call mtd_device_unregister() itself?
>> AFAICT the additional call to mtd_device_unregister() in nand_release() would
>> be an nop then.
> 
> This patch looks good, but NAND controller drivers will keep ignoring
> the nand_release() return code and release their own private data, so
> implementations are still buggy ;).
> 
> This whole NAND dev registration/deregistration is unsafe, and I plan
> to rework it when moving to a controller <-> chips infrastructure.
> 
> Are you fixing a real bug or just a potential one? Cause I'm not sure
> doing that is any safer if we don't patch all the NAND controller
> drivers...

I'm facing a real issue on nandsim.
Currently I'm heavily reworking nandsim.
One of the new features is that you can add/remove NAND MTDs during runtime
using a userspace tool. It works like losetup.

$ nandsimctl --backend file /home/rw/work/XXX/broken_mtd.raw --id-bytes 0x....

While getting this race free I found that issue.

Thanks,
//richard

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Race-free NAND device removal
  2016-07-04  9:44   ` Richard Weinberger
@ 2016-07-04 10:06     ` Boris Brezillon
  2016-07-04 11:02       ` Richard Weinberger
  0 siblings, 1 reply; 8+ messages in thread
From: Boris Brezillon @ 2016-07-04 10:06 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd@lists.infradead.org, Brian Norris

On Mon, 4 Jul 2016 11:44:03 +0200
Richard Weinberger <richard@nod.at> wrote:

> Am 04.07.2016 um 11:16 schrieb Boris Brezillon:
> > On Sun, 3 Jul 2016 15:38:42 +0200
> > Richard Weinberger <richard@nod.at> wrote:
> >   
> >> Hi!
> >>
> >> While working on nandsim I realized that nand_release() ignores the return
> >> value from mtd_device_unregister().
> >>
> >> That means NAND devices cannot removed in a race-free manner.
> >> Consider a NAND driver that registers ->_get_device() and ->_put_device()
> >> callbacks for refcounting. In its removal function it will return -EBUSY
> >> whenever the refcount is > 0.
> >> But when device is claimed while removing it, it can happen that the refcount
> >> increments after the check.
> >> MTD can deal with that and mtd_device_unregister() will return EBUSY.
> >> But nand_release() won't notice and the NAND driver continues with the tear down
> >> process.  
> > 
> > Yes, I already noticed that, and apparently all NAND controller drivers
> > seem to assume that nand_release() always succeed. It's definitely a
> > bug, since the MTD device will still be exposed, but the underlying
> > NAND structure (and the associated data + implementation) will be
> > gone :-/.  
> 
> Well, in most cases it will work since the module refcounting kicks in.
> And no NAND drivers create/remove MTDs during runtime.

Yep.

> 
> >>
> >> Would be a change like the following one acceptable or is a NAND driver
> >> allowed to call mtd_device_unregister() itself?
> >> AFAICT the additional call to mtd_device_unregister() in nand_release() would
> >> be an nop then.  
> > 
> > This patch looks good, but NAND controller drivers will keep ignoring
> > the nand_release() return code and release their own private data, so
> > implementations are still buggy ;).
> > 
> > This whole NAND dev registration/deregistration is unsafe, and I plan
> > to rework it when moving to a controller <-> chips infrastructure.
> > 
> > Are you fixing a real bug or just a potential one? Cause I'm not sure
> > doing that is any safer if we don't patch all the NAND controller
> > drivers...  
> 
> I'm facing a real issue on nandsim.
> Currently I'm heavily reworking nandsim.
> One of the new features is that you can add/remove NAND MTDs during runtime
> using a userspace tool. It works like losetup.
> 
> $ nandsimctl --backend file /home/rw/work/XXX/broken_mtd.raw --id-bytes 0x....
> 
> While getting this race free I found that issue.

Okay, so you modified nandsim code to check nand_release() return code,
right? Maybe you can send this change in your nandsim rework series
then.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Race-free NAND device removal
  2016-07-04 10:06     ` Boris Brezillon
@ 2016-07-04 11:02       ` Richard Weinberger
  2016-07-04 11:11         ` Richard Weinberger
  0 siblings, 1 reply; 8+ messages in thread
From: Richard Weinberger @ 2016-07-04 11:02 UTC (permalink / raw)
  To: Boris Brezillon; +Cc: linux-mtd@lists.infradead.org, Brian Norris

Am 04.07.2016 um 12:06 schrieb Boris Brezillon:
>> $ nandsimctl --backend file /home/rw/work/XXX/broken_mtd.raw --id-bytes 0x....
>>
>> While getting this race free I found that issue.
> 
> Okay, so you modified nandsim code to check nand_release() return code,
> right? Maybe you can send this change in your nandsim rework series
> then.

Yep. My code checks the result of nand_release().
I'll carry it in my series.

BTW: There is more fun:
When we look into mtdcore.c
int mtd_device_unregister(struct mtd_info *master)
{
        int err;

        if (master->_reboot)
                unregister_reboot_notifier(&master->reboot_notifier);

        err = del_mtd_partitions(master);
        if (err)
                return err;

        if (!device_is_registered(&master->dev))
                return 0;

        return del_mtd_device(master);
}
EXPORT_SYMBOL_GPL(mtd_device_unregister);

Either del_mtd_partitions() or del_mtd_device() will notice that the MTD usage count is > 0 and
return -EBUSY.
But at this stage we've already executed the reboot notifier. Bug or feature? ;-)

I'm also not sure about the printk in del_mtd_device():
        if (mtd->usecount) {
                printk(KERN_NOTICE "Removing MTD device #%d (%s) with use count %d\n",
                       mtd->index, mtd->name, mtd->usecount);
                ret = -EBUSY;
        } else {

Why do you have to warn the user? Is this 100% a legit use case or is the printk here to warn
that a driver is buggy?
At least with the existing UBI glubi driver you can hit this code path.
Same for the upcoming nandsim changes.

Thanks,
//richard

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Race-free NAND device removal
  2016-07-04 11:02       ` Richard Weinberger
@ 2016-07-04 11:11         ` Richard Weinberger
  2016-07-04 12:02           ` Boris Brezillon
  0 siblings, 1 reply; 8+ messages in thread
From: Richard Weinberger @ 2016-07-04 11:11 UTC (permalink / raw)
  To: Richard Weinberger
  Cc: Boris Brezillon, Brian Norris, linux-mtd@lists.infradead.org

On Mon, Jul 4, 2016 at 1:02 PM, Richard Weinberger <richard@nod.at> wrote:
> Am 04.07.2016 um 12:06 schrieb Boris Brezillon:
>>> $ nandsimctl --backend file /home/rw/work/XXX/broken_mtd.raw --id-bytes 0x....
>>>
>>> While getting this race free I found that issue.
>>
>> Okay, so you modified nandsim code to check nand_release() return code,
>> right? Maybe you can send this change in your nandsim rework series
>> then.
>
> Yep. My code checks the result of nand_release().
> I'll carry it in my series.
>
> BTW: There is more fun:
> When we look into mtdcore.c
> int mtd_device_unregister(struct mtd_info *master)
> {
>         int err;
>
>         if (master->_reboot)
>                 unregister_reboot_notifier(&master->reboot_notifier);
>
>         err = del_mtd_partitions(master);
>         if (err)
>                 return err;
>
>         if (!device_is_registered(&master->dev))
>                 return 0;
>
>         return del_mtd_device(master);
> }
> EXPORT_SYMBOL_GPL(mtd_device_unregister);
>
> Either del_mtd_partitions() or del_mtd_device() will notice that the MTD usage count is > 0 and
> return -EBUSY.
> But at this stage we've already executed the reboot notifier. Bug or feature? ;-)

Should be read removed the...

> I'm also not sure about the printk in del_mtd_device():
>         if (mtd->usecount) {
>                 printk(KERN_NOTICE "Removing MTD device #%d (%s) with use count %d\n",
>                        mtd->index, mtd->name, mtd->usecount);
>                 ret = -EBUSY;
>         } else {
>
> Why do you have to warn the user? Is this 100% a legit use case or is the printk here to warn
> that a driver is buggy?
> At least with the existing UBI glubi driver you can hit this code path.
> Same for the upcoming nandsim changes.
>
> Thanks,
> //richard
>
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/



-- 
Thanks,
//richard

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Race-free NAND device removal
  2016-07-04 11:11         ` Richard Weinberger
@ 2016-07-04 12:02           ` Boris Brezillon
  0 siblings, 0 replies; 8+ messages in thread
From: Boris Brezillon @ 2016-07-04 12:02 UTC (permalink / raw)
  To: Richard Weinberger
  Cc: Richard Weinberger, Brian Norris, linux-mtd@lists.infradead.org

On Mon, 4 Jul 2016 13:11:09 +0200
Richard Weinberger <richard.weinberger@gmail.com> wrote:

> On Mon, Jul 4, 2016 at 1:02 PM, Richard Weinberger <richard@nod.at> wrote:
> > Am 04.07.2016 um 12:06 schrieb Boris Brezillon:  
> >>> $ nandsimctl --backend file /home/rw/work/XXX/broken_mtd.raw --id-bytes 0x....
> >>>
> >>> While getting this race free I found that issue.  
> >>
> >> Okay, so you modified nandsim code to check nand_release() return code,
> >> right? Maybe you can send this change in your nandsim rework series
> >> then.  
> >
> > Yep. My code checks the result of nand_release().
> > I'll carry it in my series.
> >
> > BTW: There is more fun:
> > When we look into mtdcore.c
> > int mtd_device_unregister(struct mtd_info *master)
> > {
> >         int err;
> >
> >         if (master->_reboot)
> >                 unregister_reboot_notifier(&master->reboot_notifier);
> >
> >         err = del_mtd_partitions(master);
> >         if (err)
> >                 return err;
> >
> >         if (!device_is_registered(&master->dev))
> >                 return 0;
> >
> >         return del_mtd_device(master);
> > }
> > EXPORT_SYMBOL_GPL(mtd_device_unregister);
> >
> > Either del_mtd_partitions() or del_mtd_device() will notice that the MTD usage count is > 0 and
> > return -EBUSY.
> > But at this stage we've already executed the reboot notifier. Bug or feature? ;-)  
> 
> Should be read removed the...

Probably a bug.

> 
> > I'm also not sure about the printk in del_mtd_device():
> >         if (mtd->usecount) {
> >                 printk(KERN_NOTICE "Removing MTD device #%d (%s) with use count %d\n",
> >                        mtd->index, mtd->name, mtd->usecount);
> >                 ret = -EBUSY;
> >         } else {
> >
> > Why do you have to warn the user? Is this 100% a legit use case or is the printk here to warn
> > that a driver is buggy?
> > At least with the existing UBI glubi driver you can hit this code path.
> > Same for the upcoming nandsim changes.
> >
> > Thanks,
> > //richard
> >
> > ______________________________________________________
> > Linux MTD discussion mailing list
> > http://lists.infradead.org/mailman/listinfo/linux-mtd/  
> 
> 
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Race-free NAND device removal
  2016-07-04  9:16 ` Boris Brezillon
  2016-07-04  9:44   ` Richard Weinberger
@ 2016-07-04 21:34   ` Richard Weinberger
  1 sibling, 0 replies; 8+ messages in thread
From: Richard Weinberger @ 2016-07-04 21:34 UTC (permalink / raw)
  To: Boris Brezillon; +Cc: linux-mtd@lists.infradead.org, Brian Norris

Am 04.07.2016 um 11:16 schrieb Boris Brezillon:
> On Sun, 3 Jul 2016 15:38:42 +0200
> Richard Weinberger <richard@nod.at> wrote:
> 
>> Hi!
>>
>> While working on nandsim I realized that nand_release() ignores the return
>> value from mtd_device_unregister().
>>
>> That means NAND devices cannot removed in a race-free manner.
>> Consider a NAND driver that registers ->_get_device() and ->_put_device()
>> callbacks for refcounting. In its removal function it will return -EBUSY
>> whenever the refcount is > 0.
>> But when device is claimed while removing it, it can happen that the refcount
>> increments after the check.
>> MTD can deal with that and mtd_device_unregister() will return EBUSY.
>> But nand_release() won't notice and the NAND driver continues with the tear down
>> process.
> 
> Yes, I already noticed that, and apparently all NAND controller drivers
> seem to assume that nand_release() always succeed. It's definitely a
> bug, since the MTD device will still be exposed, but the underlying
> NAND structure (and the associated data + implementation) will be
> gone :-/.
> 
>>
>> Would be a change like the following one acceptable or is a NAND driver
>> allowed to call mtd_device_unregister() itself?
>> AFAICT the additional call to mtd_device_unregister() in nand_release() would
>> be an nop then.
> 
> This patch looks good, but NAND controller drivers will keep ignoring
> the nand_release() return code and release their own private data, so
> implementations are still buggy ;).
> 
> This whole NAND dev registration/deregistration is unsafe, and I plan
> to rework it when moving to a controller <-> chips infrastructure.
> 
> Are you fixing a real bug or just a potential one? Cause I'm not sure
> doing that is any safer if we don't patch all the NAND controller
> drivers...

Just figured that drivers/mtd/nand/r852.c also registers/removes NAND devices during
runtime.
AFAICT it is broken.

Thanks,
//richard

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2016-07-04 21:35 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-07-03 13:38 Race-free NAND device removal Richard Weinberger
2016-07-04  9:16 ` Boris Brezillon
2016-07-04  9:44   ` Richard Weinberger
2016-07-04 10:06     ` Boris Brezillon
2016-07-04 11:02       ` Richard Weinberger
2016-07-04 11:11         ` Richard Weinberger
2016-07-04 12:02           ` Boris Brezillon
2016-07-04 21:34   ` Richard Weinberger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).