* [RFC] error management in add_disk()
@ 2007-07-24 11:57 Alban Crequy
2007-07-24 13:28 ` Al Viro
0 siblings, 1 reply; 3+ messages in thread
From: Alban Crequy @ 2007-07-24 11:57 UTC (permalink / raw)
To: jens.axboe; +Cc: linux-kernel
Hi,
I have a problem with the error management of add_disk() and del_gendisk().
add_disk() adds an entry in /sys/block/<name>. The filename in /sys/block is
not (struct gen_disk)->disk_name but more or less the first KOBJ_NAME_LEN
characters of (struct gen_disk)->disk_name.
#define KOBJ_NAME_LEN 20
My problem occurs when we try to add 2 disks with different names, but when
the KOBJ_NAME_LEN first characters are the same.
It is not allowed to have several files in /sys/block/ with the same name. It
does not prevent the disk to work, but the creation of the file in /sys/block
will silently fail. See the call chain:
add_disk() -> register_disk() -> kobject_add(&disk->kobj)
del_gendisk() -> kobject_del(&disk->kobj)
add_disk() and register_disk() return void, so the caller cannot know that
there is a problem. kobject_add() returns an error but register_disk() ignores
the error.
The disk driver is unaware of the failure of disk_add(), and may call
del_gendisk(). It deletes an object in /sys/block/ that was not added (bug!).
The disk driver cannot check that, so the reference counter of /sys/block is
decremented on kobject_del().
If the user run add_disk() and del_gendisk() a lot of times with different
names having the same 20-characters beginning, the reference counter of the
/sys/block directory will reach 0 (and it will be freed) although it still
contains files.
The attached test module triggers the problem. You can try something like:
for i in $(seq 1 100) ; do insmod ./adddiskbug.ko ; rmmod adddiskbug ; done
The attached patch fixes the problem by changing the prototype of add_disk() and
register_disk() to return errors.
Index: linux-2.6.23-rc1/block/genhd.c
===================================================================
--- linux-2.6.23-rc1.orig/block/genhd.c 2007-07-23 14:52:11.000000000 +0200
+++ linux-2.6.23-rc1/block/genhd.c 2007-07-23 18:08:58.000000000 +0200
@@ -175,13 +175,22 @@
* This function registers the partitioning information in @disk
* with the kernel.
*/
-void add_disk(struct gendisk *disk)
+int add_disk(struct gendisk *disk)
{
+ int ret;
+
disk->flags |= GENHD_FL_UP;
blk_register_region(MKDEV(disk->major, disk->first_minor),
disk->minors, NULL, exact_match, exact_lock, disk);
- register_disk(disk);
+ ret = register_disk(disk);
+ if (ret) {
+ blk_unregister_region(MKDEV(disk->major, disk->first_minor),
+ disk->minors);
+ return ret;
+ }
blk_register_queue(disk);
+
+ return 0;
}
EXPORT_SYMBOL(add_disk);
Index: linux-2.6.23-rc1/include/linux/genhd.h
===================================================================
--- linux-2.6.23-rc1.orig/include/linux/genhd.h 2007-07-09 01:32:17.000000000 +0200
+++ linux-2.6.23-rc1/include/linux/genhd.h 2007-07-23 15:12:53.000000000 +0200
@@ -235,7 +235,7 @@
/* drivers/block/genhd.c */
extern int get_blkdev_list(char *, int);
-extern void add_disk(struct gendisk *disk);
+extern int add_disk(struct gendisk *disk);
extern void del_gendisk(struct gendisk *gp);
extern void unlink_gendisk(struct gendisk *gp);
extern struct gendisk *get_gendisk(dev_t dev, int *part);
Index: linux-2.6.23-rc1/fs/partitions/check.c
===================================================================
--- linux-2.6.23-rc1.orig/fs/partitions/check.c 2007-07-23 14:52:13.000000000 +0200
+++ linux-2.6.23-rc1/fs/partitions/check.c 2007-07-23 15:53:04.000000000 +0200
@@ -469,7 +469,7 @@
}
/* Not exported, helper to add_disk(). */
-void register_disk(struct gendisk *disk)
+int register_disk(struct gendisk *disk)
{
struct block_device *bdev;
char *s;
@@ -483,11 +483,11 @@
if (s)
*s = '!';
if ((err = kobject_add(&disk->kobj)))
- return;
+ return -EEXIST;
err = disk_sysfs_symlinks(disk);
if (err) {
kobject_del(&disk->kobj);
- return;
+ return -EEXIST;
}
disk_sysfs_add_subdirs(disk);
@@ -523,6 +523,8 @@
continue;
kobject_uevent(&p->kobj, KOBJ_ADD);
}
+
+ return 0;
}
int rescan_partitions(struct gendisk *disk, struct block_device *bdev)
Index: linux-2.6.23-rc1/include/linux/blkdev.h
===================================================================
--- linux-2.6.23-rc1.orig/include/linux/blkdev.h 2007-07-23 14:52:14.000000000 +0200
+++ linux-2.6.23-rc1/include/linux/blkdev.h 2007-07-23 15:58:47.000000000 +0200
@@ -643,7 +643,7 @@
extern int blk_register_queue(struct gendisk *disk);
extern void blk_unregister_queue(struct gendisk *disk);
-extern void register_disk(struct gendisk *dev);
+extern int register_disk(struct gendisk *dev);
extern void generic_make_request(struct bio *bio);
extern void blk_put_request(struct request *);
extern void __blk_put_request(request_queue_t *, struct request *);
My test module:
----->8---------->8---------->8-----
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/blkdev.h>
#include <linux/genhd.h>
#include <linux/fs.h>
#include <linux/init.h>
#define DRIVER_NAME "adddiskbug"
int adddiskbug_init(void);
void adddiskbug_exit(void);
struct block_device_operations adddiskbug_fops = {
.owner = THIS_MODULE,
.open = NULL,
.release = NULL,
.ioctl = NULL,
};
int major_number;
struct gendisk *gd[256];
void
print_refcount(struct gendisk *_gd, const char *str)
{
struct kobject *obj;
printk("%s\n", str);
printk("name=%s: count=%d\n",
_gd->disk_name,
atomic_read(& _gd->kobj.kref.refcount));
for (obj = & _gd->kobj ; obj ; obj = obj->parent)
{
printk(" obj %p '%s': %p '%s' count=%d\n",
obj, obj->name, obj->k_name, obj->k_name?obj->k_name:"",
atomic_read(& obj->kref.refcount));
}
}
int new_disk(const char *name, int minor)
{
int ret = 0;
if (strlen(name) >= 32)
{
printk("ERROR: Name '%s' too long\n", name);
return -EINVAL;
}
gd[minor] = alloc_disk(1);
if (!gd[minor])
return -ENOMEM;
print_refcount(gd[0], "After alloc_disk:\n");
gd[minor]->major = major_number;
gd[minor]->first_minor = minor;
gd[minor]->fops = &adddiskbug_fops;
strlcpy(gd[minor]->disk_name, name, sizeof(gd[minor]->disk_name));
set_capacity(gd[minor], (sector_t) 8192); /* 4MB */
gd[minor]->queue = blk_alloc_queue(GFP_KERNEL);
if (! gd[minor]->queue)
{
put_disk(gd[minor]);
return -ENOMEM;
}
#ifndef KERNEL_WITH_MY_PATCH
add_disk(gd[minor]);
#else
ret = add_disk(gd[minor]);
if (ret)
{
blk_cleanup_queue(gd[minor]->queue);
put_disk(gd[minor]);
return ret;
}
#endif
return ret;
}
void delete_disk(int minor)
{
struct gendisk *_gd = gd[minor];
blk_cleanup_queue(_gd->queue);
del_gendisk(_gd);
put_disk(_gd);
}
int
adddiskbug_init(void)
{
int ret;
printk(" ==== init ===== \n");
major_number = register_blkdev(0, DRIVER_NAME);
if (major_number < 0)
return -EINVAL;
ret = new_disk("my_bogus_driver_number_1", 0);
if (ret < 0)
{
unregister_blkdev(major_number, DRIVER_NAME);
return -EINVAL;
}
print_refcount(gd[0], "after new_disk 0");
ret = new_disk("my_bogus_driver_number_2", 1);
if (ret < 0)
{
delete_disk(0);
unregister_blkdev(major_number, DRIVER_NAME);
return -EINVAL;
}
print_refcount(gd[0], "after new_disk 1");
return 0;
}
void
adddiskbug_exit(void)
{
print_refcount(gd[0], "before delete_disk 0");
delete_disk(1);
print_refcount(gd[0], "before delete_disk 1");
delete_disk(0);
unregister_blkdev(major_number, DRIVER_NAME);
}
module_init(adddiskbug_init);
module_exit(adddiskbug_exit);
----->8---------->8---------->8-----
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [RFC] error management in add_disk()
2007-07-24 11:57 [RFC] error management in add_disk() Alban Crequy
@ 2007-07-24 13:28 ` Al Viro
2007-07-25 15:29 ` Alban Crequy
0 siblings, 1 reply; 3+ messages in thread
From: Al Viro @ 2007-07-24 13:28 UTC (permalink / raw)
To: Alban Crequy; +Cc: jens.axboe, linux-kernel
On Tue, Jul 24, 2007 at 01:57:53PM +0200, Alban Crequy wrote:
> Hi,
>
> I have a problem with the error management of add_disk() and del_gendisk().
>
> add_disk() adds an entry in /sys/block/<name>. The filename in /sys/block is
> not (struct gen_disk)->disk_name but more or less the first KOBJ_NAME_LEN
> characters of (struct gen_disk)->disk_name.
>
> #define KOBJ_NAME_LEN 20
>
> My problem occurs when we try to add 2 disks with different names, but when
> the KOBJ_NAME_LEN first characters are the same.
So don't do that.
> The attached test module triggers the problem. You can try something like:
> for i in $(seq 1 100) ; do insmod ./adddiskbug.ko ; rmmod adddiskbug ; done
>
> The attached patch fixes the problem by changing the prototype of add_disk() and
> register_disk() to return errors.
This is bogus. Just what would callers do with these error values?
Ignore them silently? Bail out? Can't do - at that point disk just might
have been opened already. add_disk() is the point of no return; we are
already past the last point where we could bail out.
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [RFC] error management in add_disk()
2007-07-24 13:28 ` Al Viro
@ 2007-07-25 15:29 ` Alban Crequy
0 siblings, 0 replies; 3+ messages in thread
From: Alban Crequy @ 2007-07-25 15:29 UTC (permalink / raw)
To: Al Viro; +Cc: jens.axboe, linux-kernel
Le Tue, 24 Jul 2007 14:28:05 +0100,
Al Viro <viro@ftp.linux.org.uk> a écrit :
>On Tue, Jul 24, 2007 at 01:57:53PM +0200, Alban Crequy wrote:
>> Hi,
>>
>> I have a problem with the error management of add_disk() and
>> del_gendisk().
>>
>> add_disk() adds an entry in /sys/block/<name>. The filename
>> in /sys/block is not (struct gen_disk)->disk_name but more or less
>> the first KOBJ_NAME_LEN characters of (struct gen_disk)->disk_name.
>>
>> #define KOBJ_NAME_LEN 20
>>
>> My problem occurs when we try to add 2 disks with different names,
>> but when the KOBJ_NAME_LEN first characters are the same.
>
>So don't do that.
I no more do that. But I still think it would be better if we found a
way to manage errors in that case.
I fear that parts of kernel make this error. For example, old version of
GFS has this code:
http://csourcesearch.net/package/gfs-kernel/2.6.9/gfs-kernel-2.6.9-27/src/gfs/diaper.c
char buf[BDEVNAME_SIZE];
bdevname(real, buf);
snprintf(gd->disk_name, sizeof(gd->disk_name), "diapered_%s", buf);
Since BDEVNAME_SIZE is 32 and KOBJ_NAME_LEN is 20, the bug happens quite
easily.
I did not check closely if this is a problem, but there is other parts
in the current kernel that build the disk_name with snprintf("...%s...")
>> The attached test module triggers the problem. You can try something
>> like: for i in $(seq 1 100) ; do insmod ./adddiskbug.ko ; rmmod
>> adddiskbug ; done
>>
>> The attached patch fixes the problem by changing the prototype of
>> add_disk() and register_disk() to return errors.
>
>This is bogus. Just what would callers do with these error values?
>Ignore them silently? Bail out? Can't do - at that point disk just
>might have been opened already. add_disk() is the point of no return;
>we are already past the last point where we could bail out.
I missed that point - that the disk might have been opened. Where is
the point of no return in add_disk() exactly? Is it really before the
kobject_add() that causes the problem?
In this case, perhaps we can 1/ check that the kobject_add() will not
fail before the point of no return, 2/ pass this point and then 3/ do
the kobject_add(). And add appropriate locking to ensure that nobody
add another disk with the same 20-characters truncated name between 1/
and 3/.
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2007-07-25 15:30 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-07-24 11:57 [RFC] error management in add_disk() Alban Crequy
2007-07-24 13:28 ` Al Viro
2007-07-25 15:29 ` Alban Crequy
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox