* [PATCH 00/24] dm-raid456 support using md/raid5.c, now with dirty-log
@ 2010-06-01 9:56 NeilBrown
2010-06-01 9:56 ` [PATCH 01/24] md: reduce dependence on sysfs NeilBrown
` (23 more replies)
0 siblings, 24 replies; 27+ messages in thread
From: NeilBrown @ 2010-06-01 9:56 UTC (permalink / raw)
To: Heinz Mauelshagen, Alasdair G Kergon; +Cc: linux-raid, dm-devel
Nearly two months ago I posted my first serious attempt at making
md/raid5.c work as a dm target. I met with cautious approval, I think,
but as it didn't interact with dirty-logs yet it wasn't really ready
for prime time.
It has taken longer than I hoped, but here is version 2, now with
dirty-log integration.
I have done a modest amount of testing, watching the bits in the log
getting cleared and set just as you would expect, and watching the
resync complete instantly when the dirty-log shows that all regions
are clean.
There is not even a hint of cluster support yet, but that shouldn't be
necessary for initial submission to mainline.
A significant difference to Heinz' dm-raid45 is that I only honour the
table options that lvm actually sets. The extra ones don't really fit
with md/raid5 and I don't think they need to be table options. If any
are needed they might work OK as messages (???).
There are a number of changes to core-dm in here including:
- support for targets to be unplugged when the device is
- support for targets to report congestion beyond the congestion
of component devices
- support for the dirty-log to cover an extent different from the
size of the target (For raid5 it must be the size of the
components).
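To illustrate the last point, here is a rough userspace sketch of why the log extent differs from the target size for raid5. The helper names and the sample numbers are made up for illustration and are not part of the patches; it only assumes that the usable size is (devices - parity devices) times the component size, while each log region describes a range of component sectors.

```c
#include <stdint.h>

/* Hypothetical helper: usable sectors of a RAID4/5/6 target. */
uint64_t target_sectors(unsigned raid_disks, unsigned parity_devs,
                        uint64_t component_sectors)
{
    return (uint64_t)(raid_disks - parity_devs) * component_sectors;
}

/* Hypothetical helper: regions needed to cover an extent, rounded up. */
uint64_t log_regions(uint64_t extent_sectors, uint64_t region_sectors)
{
    return (extent_sectors + region_sectors - 1) / region_sectors;
}
```

With 4 devices, 1 parity device and 1000000-sector components, the target is 3000000 sectors, but the log only needs to cover the 1000000 sectors of a component.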
I have tried to fit these to the general style of dm as best as I can.
There is certainly room for more testing and review, but I would like
to see this entering -next soon with a view to seeing it merged in the
next merge window.
Is this reasonable? Achievable?
Comments?
These patches can all be found on the "md-dm-raid45" branch of
git://neil.brown.name/md/
or at http://neil.brown.name/git?p=md;a=shortlog;h=refs/heads/md-dm-raid45
Thanks,
NeilBrown
---
NeilBrown (24):
md: reduce dependence on sysfs.
md/raid5: factor out code for changing size of stripe cache.
md/raid5: ensure we create a unique name for kmem_cache when mddev has no gendisk
md: be more careful setting MD_CHANGE_CLEAN
md: split out md_rdev_init
md: export various start/stop interfaces
md/dm: create dm-raid456 module using md/raid5
dm-raid456: add support for raising events to userspace.
raid5: Don't set read-ahead when there is no queue
dm-raid456: add congestion checking.
md/raid5: add simple plugging infrastructure.
md/plug: optionally use plugger to unplug an array during resync/recovery.
dm-raid456: support unplug
dm-raid456: add support for setting IO hints.
dm-raid456: add suspend/resume method
dm-raid456: add message handler.
md/bitmap: white space clean up and similar.
md/bitmap: reduce dependence on sysfs.
md/bitmap: clean up plugging calls.
md/bitmap: optimise scanning of empty bitmaps.
dm-dirty-log: allow log size to be different from target size.
md/bitmap: prepare for storing write-intent-bitmap via dm-dirty-log.
md/bitmap: separate out loading a bitmap from initialising the structures.
dm-raid456: switch to use dm_dirty_log for tracking dirty regions.
drivers/md/Kconfig | 8 +
drivers/md/Makefile | 1
drivers/md/bitmap.c | 508 +++++++++++++++++---------------
drivers/md/bitmap.h | 6
drivers/md/dm-log-userspace-base.c | 11 -
drivers/md/dm-log.c | 18 +
drivers/md/dm-raid1.c | 4
drivers/md/dm-raid456.c | 576 ++++++++++++++++++++++++++++++++++++
drivers/md/dm-table.c | 19 +
drivers/md/md.c | 234 +++++++++------
drivers/md/md.h | 51 +++
drivers/md/raid5.c | 169 ++++++-----
drivers/md/raid5.h | 8 -
include/linux/device-mapper.h | 13 +
include/linux/dm-dirty-log.h | 3
15 files changed, 1235 insertions(+), 394 deletions(-)
create mode 100644 drivers/md/dm-raid456.c
--
* [PATCH 01/24] md: reduce dependence on sysfs.
2010-06-01 9:56 [PATCH 00/24] dm-raid456 support using md/raid5.c, now with dirty-log NeilBrown
@ 2010-06-01 9:56 ` NeilBrown
2010-06-01 9:56 ` [PATCH 03/24] md/raid5: ensure we create a unique name for kmem_cache when mddev has no gendisk NeilBrown
` (22 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: NeilBrown @ 2010-06-01 9:56 UTC (permalink / raw)
To: Heinz Mauelshagen, Alasdair G Kergon; +Cc: linux-raid, dm-devel
We will want md devices to live as dm targets where sysfs is not
visible. So allow md to not connect to sysfs.
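The approach taken here is the usual "NULL-tolerant wrapper" pattern. A minimal userspace sketch follows; `struct notifier`, `notify()` and `notify_safe()` are stand-ins invented for illustration, not kernel interfaces.

```c
#include <stddef.h>

/* Stand-in for struct sysfs_dirent. */
struct notifier { int notified; };

/* Stand-in for sysfs_notify_dirent(): must not be called with NULL. */
void notify(struct notifier *n)
{
    n->notified++;
}

/* Same shape as sysfs_notify_dirent_safe(): skip quietly when no entry exists. */
void notify_safe(struct notifier *n)
{
    if (n)
        notify(n);
}
```

Callers can then drop their own NULL checks, which is what most of the hunks in this patch do.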
Signed-off-by: NeilBrown <neilb@suse.de>
---
drivers/md/md.c | 101 ++++++++++++++++++++++++----------------------------
drivers/md/md.h | 12 ++++++
drivers/md/raid5.c | 8 ++--
3 files changed, 62 insertions(+), 59 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 46b3a04..6fbd1c7 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -541,14 +541,16 @@ static void mddev_unlock(mddev_t * mddev)
mutex_lock(&mddev->open_mutex);
mutex_unlock(&mddev->reconfig_mutex);
- if (to_remove != &md_redundancy_group)
- sysfs_remove_group(&mddev->kobj, to_remove);
- if (mddev->pers == NULL ||
- mddev->pers->sync_request == NULL) {
- sysfs_remove_group(&mddev->kobj, &md_redundancy_group);
- if (mddev->sysfs_action)
- sysfs_put(mddev->sysfs_action);
- mddev->sysfs_action = NULL;
+ if (mddev->kobj.sd) {
+ if (to_remove != &md_redundancy_group)
+ sysfs_remove_group(&mddev->kobj, to_remove);
+ if (mddev->pers == NULL ||
+ mddev->pers->sync_request == NULL) {
+ sysfs_remove_group(&mddev->kobj, &md_redundancy_group);
+ if (mddev->sysfs_action)
+ sysfs_put(mddev->sysfs_action);
+ mddev->sysfs_action = NULL;
+ }
}
mutex_unlock(&mddev->open_mutex);
} else
@@ -1811,11 +1813,9 @@ static int bind_rdev_to_array(mdk_rdev_t * rdev, mddev_t * mddev)
goto fail;
ko = &part_to_dev(rdev->bdev->bd_part)->kobj;
- if ((err = sysfs_create_link(&rdev->kobj, ko, "block"))) {
- kobject_del(&rdev->kobj);
- goto fail;
- }
- rdev->sysfs_state = sysfs_get_dirent(rdev->kobj.sd, NULL, "state");
+ if (sysfs_create_link(&rdev->kobj, ko, "block"))
+ /* failure here is OK */;
+ rdev->sysfs_state = sysfs_get_dirent_safe(rdev->kobj.sd, "state");
list_add_rcu(&rdev->same_set, &mddev->disks);
bd_claim_by_disk(rdev->bdev, rdev->bdev->bd_holder, mddev->gendisk);
@@ -2333,8 +2333,8 @@ state_store(mdk_rdev_t *rdev, const char *buf, size_t len)
set_bit(In_sync, &rdev->flags);
err = 0;
}
- if (!err && rdev->sysfs_state)
- sysfs_notify_dirent(rdev->sysfs_state);
+ if (!err)
+ sysfs_notify_dirent_safe(rdev->sysfs_state);
return err ? err : len;
}
static struct rdev_sysfs_entry rdev_state =
@@ -2429,14 +2429,10 @@ slot_store(mdk_rdev_t *rdev, const char *buf, size_t len)
rdev->raid_disk = -1;
return err;
} else
- sysfs_notify_dirent(rdev->sysfs_state);
+ sysfs_notify_dirent_safe(rdev->sysfs_state);
sprintf(nm, "rd%d", rdev->raid_disk);
if (sysfs_create_link(&rdev->mddev->kobj, &rdev->kobj, nm))
- printk(KERN_WARNING
- "md: cannot register "
- "%s for %s\n",
- nm, mdname(rdev->mddev));
-
+ /* failure here is OK */;
/* don't wakeup anyone, leave that to userspace. */
} else {
if (slot >= rdev->mddev->raid_disks)
@@ -2446,7 +2442,7 @@ slot_store(mdk_rdev_t *rdev, const char *buf, size_t len)
clear_bit(Faulty, &rdev->flags);
clear_bit(WriteMostly, &rdev->flags);
set_bit(In_sync, &rdev->flags);
- sysfs_notify_dirent(rdev->sysfs_state);
+ sysfs_notify_dirent_safe(rdev->sysfs_state);
}
return len;
}
@@ -3411,7 +3407,7 @@ array_state_store(mddev_t *mddev, const char *buf, size_t len)
if (err)
return err;
else {
- sysfs_notify_dirent(mddev->sysfs_state);
+ sysfs_notify_dirent_safe(mddev->sysfs_state);
return len;
}
}
@@ -3709,7 +3705,7 @@ action_store(mddev_t *mddev, const char *page, size_t len)
}
set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
md_wakeup_thread(mddev->thread);
- sysfs_notify_dirent(mddev->sysfs_action);
+ sysfs_notify_dirent_safe(mddev->sysfs_action);
return len;
}
@@ -4255,13 +4251,14 @@ static int md_alloc(dev_t dev, char *name)
disk->disk_name);
error = 0;
}
- if (sysfs_create_group(&mddev->kobj, &md_bitmap_group))
+ if (mddev->kobj.sd &&
+ sysfs_create_group(&mddev->kobj, &md_bitmap_group))
printk(KERN_DEBUG "pointless warning\n");
abort:
mutex_unlock(&disks_mutex);
- if (!error) {
+ if (!error && mddev->kobj.sd) {
kobject_uevent(&mddev->kobj, KOBJ_ADD);
- mddev->sysfs_state = sysfs_get_dirent(mddev->kobj.sd, NULL, "array_state");
+ mddev->sysfs_state = sysfs_get_dirent_safe(mddev->kobj.sd, "array_state");
}
mddev_put(mddev);
return error;
@@ -4299,7 +4296,7 @@ static void md_safemode_timeout(unsigned long data)
if (!atomic_read(&mddev->writes_pending)) {
mddev->safemode = 1;
if (mddev->external)
- sysfs_notify_dirent(mddev->sysfs_state);
+ sysfs_notify_dirent_safe(mddev->sysfs_state);
}
md_wakeup_thread(mddev->thread);
}
@@ -4371,7 +4368,7 @@ static int md_run(mddev_t *mddev)
return -EINVAL;
}
}
- sysfs_notify_dirent(rdev->sysfs_state);
+ sysfs_notify_dirent_safe(rdev->sysfs_state);
}
spin_lock(&pers_lock);
@@ -4470,11 +4467,12 @@ static int md_run(mddev_t *mddev)
return err;
}
if (mddev->pers->sync_request) {
- if (sysfs_create_group(&mddev->kobj, &md_redundancy_group))
+ if (mddev->kobj.sd &&
+ sysfs_create_group(&mddev->kobj, &md_redundancy_group))
printk(KERN_WARNING
"md: cannot register extra attributes for %s\n",
mdname(mddev));
- mddev->sysfs_action = sysfs_get_dirent(mddev->kobj.sd, NULL, "sync_action");
+ mddev->sysfs_action = sysfs_get_dirent_safe(mddev->kobj.sd, "sync_action");
} else if (mddev->ro == 2) /* auto-readonly not meaningful */
mddev->ro = 0;
@@ -4492,8 +4490,7 @@ static int md_run(mddev_t *mddev)
char nm[20];
sprintf(nm, "rd%d", rdev->raid_disk);
if (sysfs_create_link(&mddev->kobj, &rdev->kobj, nm))
- printk("md: cannot register %s for %s\n",
- nm, mdname(mddev));
+ /* failure here is OK */;
}
set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
@@ -4505,9 +4502,8 @@ static int md_run(mddev_t *mddev)
md_wakeup_thread(mddev->sync_thread); /* possibly kick off a reshape */
md_new_event(mddev);
- sysfs_notify_dirent(mddev->sysfs_state);
- if (mddev->sysfs_action)
- sysfs_notify_dirent(mddev->sysfs_action);
+ sysfs_notify_dirent_safe(mddev->sysfs_state);
+ sysfs_notify_dirent_safe(mddev->sysfs_action);
sysfs_notify(&mddev->kobj, NULL, "degraded");
return 0;
}
@@ -4547,7 +4543,7 @@ static int restart_array(mddev_t *mddev)
set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
md_wakeup_thread(mddev->thread);
md_wakeup_thread(mddev->sync_thread);
- sysfs_notify_dirent(mddev->sysfs_state);
+ sysfs_notify_dirent_safe(mddev->sysfs_state);
return 0;
}
@@ -4671,7 +4667,7 @@ static int md_set_readonly(mddev_t *mddev, int is_open)
mddev->ro = 1;
set_disk_ro(mddev->gendisk, 1);
clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
- sysfs_notify_dirent(mddev->sysfs_state);
+ sysfs_notify_dirent_safe(mddev->sysfs_state);
err = 0;
}
out:
@@ -4704,7 +4700,7 @@ static int do_md_stop(mddev_t * mddev, int mode, int is_open)
mddev->queue->backing_dev_info.congested_fn = NULL;
/* tell userspace to handle 'inactive' */
- sysfs_notify_dirent(mddev->sysfs_state);
+ sysfs_notify_dirent_safe(mddev->sysfs_state);
list_for_each_entry(rdev, &mddev->disks, same_set)
if (rdev->raid_disk >= 0) {
@@ -4750,7 +4746,7 @@ static int do_md_stop(mddev_t * mddev, int mode, int is_open)
err = 0;
blk_integrity_unregister(disk);
md_new_event(mddev);
- sysfs_notify_dirent(mddev->sysfs_state);
+ sysfs_notify_dirent_safe(mddev->sysfs_state);
return err;
}
@@ -5112,7 +5108,7 @@ static int add_new_disk(mddev_t * mddev, mdu_disk_info_t *info)
if (err)
export_rdev(rdev);
else
- sysfs_notify_dirent(rdev->sysfs_state);
+ sysfs_notify_dirent_safe(rdev->sysfs_state);
md_update_sb(mddev, 1);
if (mddev->degraded)
@@ -5787,7 +5783,7 @@ static int md_ioctl(struct block_device *bdev, fmode_t mode,
if (_IOC_TYPE(cmd) == MD_MAJOR && mddev->ro && mddev->pers) {
if (mddev->ro == 2) {
mddev->ro = 0;
- sysfs_notify_dirent(mddev->sysfs_state);
+ sysfs_notify_dirent_safe(mddev->sysfs_state);
set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
md_wakeup_thread(mddev->thread);
} else {
@@ -6032,7 +6028,7 @@ void md_error(mddev_t *mddev, mdk_rdev_t *rdev)
mddev->pers->error_handler(mddev,rdev);
if (mddev->degraded)
set_bit(MD_RECOVERY_RECOVER, &mddev->recovery);
- sysfs_notify_dirent(rdev->sysfs_state);
+ sysfs_notify_dirent_safe(rdev->sysfs_state);
set_bit(MD_RECOVERY_INTR, &mddev->recovery);
set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
md_wakeup_thread(mddev->thread);
@@ -6493,7 +6489,7 @@ void md_write_start(mddev_t *mddev, struct bio *bi)
spin_unlock_irq(&mddev->write_lock);
}
if (did_change)
- sysfs_notify_dirent(mddev->sysfs_state);
+ sysfs_notify_dirent_safe(mddev->sysfs_state);
wait_event(mddev->sb_wait,
!test_bit(MD_CHANGE_CLEAN, &mddev->flags) &&
!test_bit(MD_CHANGE_PENDING, &mddev->flags));
@@ -6536,7 +6532,7 @@ int md_allow_write(mddev_t *mddev)
mddev->safemode = 1;
spin_unlock_irq(&mddev->write_lock);
md_update_sb(mddev, 0);
- sysfs_notify_dirent(mddev->sysfs_state);
+ sysfs_notify_dirent_safe(mddev->sysfs_state);
} else
spin_unlock_irq(&mddev->write_lock);
@@ -6922,10 +6918,7 @@ static int remove_and_add_spares(mddev_t *mddev)
sprintf(nm, "rd%d", rdev->raid_disk);
if (sysfs_create_link(&mddev->kobj,
&rdev->kobj, nm))
- printk(KERN_WARNING
- "md: cannot register "
- "%s for %s\n",
- nm, mdname(mddev));
+ /* failure here is OK */;
spares++;
md_new_event(mddev);
set_bit(MD_CHANGE_DEVS, &mddev->flags);
@@ -7018,7 +7011,7 @@ void md_check_recovery(mddev_t *mddev)
mddev->safemode = 0;
spin_unlock_irq(&mddev->write_lock);
if (did_change)
- sysfs_notify_dirent(mddev->sysfs_state);
+ sysfs_notify_dirent_safe(mddev->sysfs_state);
}
if (mddev->flags)
@@ -7057,7 +7050,7 @@ void md_check_recovery(mddev_t *mddev)
mddev->recovery = 0;
/* flag recovery needed just to double check */
set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
- sysfs_notify_dirent(mddev->sysfs_action);
+ sysfs_notify_dirent_safe(mddev->sysfs_action);
md_new_event(mddev);
goto unlock;
}
@@ -7119,7 +7112,7 @@ void md_check_recovery(mddev_t *mddev)
mddev->recovery = 0;
} else
md_wakeup_thread(mddev->sync_thread);
- sysfs_notify_dirent(mddev->sysfs_action);
+ sysfs_notify_dirent_safe(mddev->sysfs_action);
md_new_event(mddev);
}
unlock:
@@ -7128,7 +7121,7 @@ void md_check_recovery(mddev_t *mddev)
if (test_and_clear_bit(MD_RECOVERY_RECOVER,
&mddev->recovery))
if (mddev->sysfs_action)
- sysfs_notify_dirent(mddev->sysfs_action);
+ sysfs_notify_dirent_safe(mddev->sysfs_action);
}
mddev_unlock(mddev);
}
@@ -7136,7 +7129,7 @@ void md_check_recovery(mddev_t *mddev)
void md_wait_for_blocked_rdev(mdk_rdev_t *rdev, mddev_t *mddev)
{
- sysfs_notify_dirent(rdev->sysfs_state);
+ sysfs_notify_dirent_safe(rdev->sysfs_state);
wait_event_timeout(rdev->blocked_wait,
!test_bit(Blocked, &rdev->flags),
msecs_to_jiffies(5000));
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 7ab5ea1..1f680e4 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -379,6 +379,18 @@ struct md_sysfs_entry {
};
extern struct attribute_group md_bitmap_group;
+static inline struct sysfs_dirent *sysfs_get_dirent_safe(struct sysfs_dirent *sd, char *name)
+{
+ if (sd)
+ return sysfs_get_dirent(sd, NULL, name);
+ return sd;
+}
+static inline void sysfs_notify_dirent_safe(struct sysfs_dirent *sd)
+{
+ if (sd)
+ sysfs_notify_dirent(sd);
+}
+
static inline char * mdname (mddev_t * mddev)
{
return mddev->gendisk ? mddev->gendisk->disk_name : "mdX";
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index d2c0f94..8882da3 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -5070,7 +5070,8 @@ static int run(mddev_t *mddev)
/* Ok, everything is just fine now */
if (mddev->to_remove == &raid5_attrs_group)
mddev->to_remove = NULL;
- else if (sysfs_create_group(&mddev->kobj, &raid5_attrs_group))
+ else if (mddev->kobj.sd &&
+ sysfs_create_group(&mddev->kobj, &raid5_attrs_group))
printk(KERN_WARNING
"md/raid:%s: failed to create sysfs attributes.\n",
mdname(mddev));
@@ -5451,10 +5452,7 @@ static int raid5_start_reshape(mddev_t *mddev)
sprintf(nm, "rd%d", rdev->raid_disk);
if (sysfs_create_link(&mddev->kobj,
&rdev->kobj, nm))
- printk(KERN_WARNING
- "md/raid:%s: failed to create "
- " link %s\n",
- mdname(mddev), nm);
+ /* Failure here is OK */;
} else
break;
}
* [PATCH 02/24] md/raid5: factor out code for changing size of stripe cache.
2010-06-01 9:56 [PATCH 00/24] dm-raid456 support using md/raid5.c, now with dirty-log NeilBrown
2010-06-01 9:56 ` [PATCH 01/24] md: reduce dependence on sysfs NeilBrown
2010-06-01 9:56 ` [PATCH 03/24] md/raid5: ensure we create a unique name for kmem_cache when mddev has no gendisk NeilBrown
@ 2010-06-01 9:56 ` NeilBrown
2010-06-01 9:56 ` [PATCH 08/24] dm-raid456: add support for raising events to userspace NeilBrown
` (20 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: NeilBrown @ 2010-06-01 9:56 UTC (permalink / raw)
To: Heinz Mauelshagen, Alasdair G Kergon; +Cc: linux-raid, dm-devel
Separate the actual 'change' code from the sysfs interface
so that it can eventually be called internally.
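As a rough userspace sketch of the split (all names are hypothetical, and resizing is reduced to a plain assignment rather than growing or dropping stripes): one function validates and applies the size, and a thin front end only parses text and delegates.

```c
#include <errno.h>
#include <stdlib.h>

struct cache { int max_nr_stripes; };

/* Core operation: validates and applies a size; usable by internal callers. */
int cache_set_size(struct cache *c, int size)
{
    if (size <= 16 || size > 32768)
        return -EINVAL;
    c->max_nr_stripes = size;
    return 0;
}

/* Text front end: parse, then delegate. Returns len or a negative errno. */
long cache_store_size(struct cache *c, const char *page, size_t len)
{
    char *end;
    unsigned long val;
    int err;

    errno = 0;
    val = strtoul(page, &end, 10);
    if (errno || end == page || (*end != '\0' && *end != '\n'))
        return -EINVAL;
    if (val > 32768)        /* range-check before narrowing to int */
        return -EINVAL;
    err = cache_set_size(c, (int)val);
    return err ? err : (long)len;
}
```

The sketch range-checks the parsed value before converting it to int; that extra check is my own choice for the example and is not taken from the patch.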
Signed-off-by: NeilBrown <neilb@suse.de>
---
drivers/md/raid5.c | 38 +++++++++++++++++++++++++-------------
1 files changed, 25 insertions(+), 13 deletions(-)
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 8882da3..8647705 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -4480,23 +4480,15 @@ raid5_show_stripe_cache_size(mddev_t *mddev, char *page)
return 0;
}
-static ssize_t
-raid5_store_stripe_cache_size(mddev_t *mddev, const char *page, size_t len)
+static int
+raid5_set_cache_size(mddev_t *mddev, int size)
{
raid5_conf_t *conf = mddev->private;
- unsigned long new;
int err;
- if (len >= PAGE_SIZE)
+ if (size <= 16 || size > 32768)
return -EINVAL;
- if (!conf)
- return -ENODEV;
-
- if (strict_strtoul(page, 10, &new))
- return -EINVAL;
- if (new <= 16 || new > 32768)
- return -EINVAL;
- while (new < conf->max_nr_stripes) {
+ while (size < conf->max_nr_stripes) {
if (drop_one_stripe(conf))
conf->max_nr_stripes--;
else
@@ -4505,11 +4497,31 @@ raid5_store_stripe_cache_size(mddev_t *mddev, const char *page, size_t len)
err = md_allow_write(mddev);
if (err)
return err;
- while (new > conf->max_nr_stripes) {
+ while (size > conf->max_nr_stripes) {
if (grow_one_stripe(conf))
conf->max_nr_stripes++;
else break;
}
+ return 0;
+}
+
+static ssize_t
+raid5_store_stripe_cache_size(mddev_t *mddev, const char *page, size_t len)
+{
+ raid5_conf_t *conf = mddev->private;
+ unsigned long new;
+ int err;
+
+ if (len >= PAGE_SIZE)
+ return -EINVAL;
+ if (!conf)
+ return -ENODEV;
+
+ if (strict_strtoul(page, 10, &new))
+ return -EINVAL;
+ err = raid5_set_cache_size(mddev, new);
+ if (err)
+ return err;
return len;
}
* [PATCH 03/24] md/raid5: ensure we create a unique name for kmem_cache when mddev has no gendisk
2010-06-01 9:56 [PATCH 00/24] dm-raid456 support using md/raid5.c, now with dirty-log NeilBrown
2010-06-01 9:56 ` [PATCH 01/24] md: reduce dependence on sysfs NeilBrown
@ 2010-06-01 9:56 ` NeilBrown
2010-06-01 9:56 ` [PATCH 02/24] md/raid5: factor out code for changing size of stripe cache NeilBrown
` (21 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: NeilBrown @ 2010-06-01 9:56 UTC (permalink / raw)
To: Heinz Mauelshagen, Alasdair G Kergon; +Cc: linux-raid, dm-devel
We will shortly allow md devices with no gendisk (they are attached to
a dm-target instead). That will cause mdname() to return 'mdX'.
There is one place where mdname really needs to be unique: when
creating the name for a slab cache.
So in that case, if there is no gendisk, use the address of the mddev
formatted in hex to provide a unique name.
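A small userspace sketch of the naming scheme, assuming a hypothetical `struct dev` whose `disk_name` is NULL when there is no gendisk. The buffer length of 32 matches the new `cache_name` size in the patch; everything else is illustrative.

```c
#include <stdio.h>
#include <string.h>

#define NAME_LEN 32

struct dev {
    const char *disk_name;   /* NULL when there is no gendisk */
    int level;
};

/* Build the primary and "-alt" names; fall back to the object address. */
void make_cache_names(const struct dev *d, char names[2][NAME_LEN])
{
    if (d->disk_name)
        snprintf(names[0], NAME_LEN, "raid%d-%s", d->level, d->disk_name);
    else
        snprintf(names[0], NAME_LEN, "raid%d-%p", d->level, (void *)d);
    snprintf(names[1], NAME_LEN, "%s-alt", names[0]);
}

/* Two caches would collide if their primary names compare equal. */
int names_collide(char a[2][NAME_LEN], char b[2][NAME_LEN])
{
    return strcmp(a[0], b[0]) == 0;
}
```

Two live objects never share an address, so the fallback names stay distinct for as long as both exist.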
Signed-off-by: NeilBrown <neilb@suse.de>
---
drivers/md/raid5.c | 12 ++++++++----
drivers/md/raid5.h | 2 +-
2 files changed, 9 insertions(+), 5 deletions(-)
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 8647705..74e36a2 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -1269,10 +1269,14 @@ static int grow_stripes(raid5_conf_t *conf, int num)
struct kmem_cache *sc;
int devs = max(conf->raid_disks, conf->previous_raid_disks);
- sprintf(conf->cache_name[0],
- "raid%d-%s", conf->level, mdname(conf->mddev));
- sprintf(conf->cache_name[1],
- "raid%d-%s-alt", conf->level, mdname(conf->mddev));
+ if (conf->mddev->gendisk)
+ sprintf(conf->cache_name[0],
+ "raid%d-%s", conf->level, mdname(conf->mddev));
+ else
+ sprintf(conf->cache_name[0],
+ "raid%d-%p", conf->level, conf->mddev);
+ sprintf(conf->cache_name[1], "%s-alt", conf->cache_name[0]);
+
conf->active_name = 0;
sc = kmem_cache_create(conf->cache_name[conf->active_name],
sizeof(struct stripe_head)+(devs-1)*sizeof(struct r5dev),
diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
index 0f86f5e..bb7ab92 100644
--- a/drivers/md/raid5.h
+++ b/drivers/md/raid5.h
@@ -388,7 +388,7 @@ struct raid5_private_data {
* two caches.
*/
int active_name;
- char cache_name[2][20];
+ char cache_name[2][32];
struct kmem_cache *slab_cache; /* for allocating stripes */
int seq_flush, seq_write;
* [PATCH 04/24] md: be more careful setting MD_CHANGE_CLEAN
2010-06-01 9:56 [PATCH 00/24] dm-raid456 support using md/raid5.c, now with dirty-log NeilBrown
` (6 preceding siblings ...)
2010-06-01 9:56 ` [PATCH 10/24] dm-raid456: add congestion checking NeilBrown
@ 2010-06-01 9:56 ` NeilBrown
2010-06-01 9:56 ` [PATCH 05/24] md: split out md_rdev_init NeilBrown
` (15 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: NeilBrown @ 2010-06-01 9:56 UTC (permalink / raw)
To: Heinz Mauelshagen, Alasdair G Kergon; +Cc: linux-raid, dm-devel
When MD_CHANGE_CLEAN is set we might block in md_write_start.
So we should only set it when fairly sure that something will clear
it.
There are two places where it is set so as to encourage a metadata
update to record the progress of resync/recovery. This should only
be done if the internal metadata update mechanisms are in use, which
can be tested by inspecting '->persistent'.
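The rule can be sketched in userspace as follows. `CHANGE_CLEAN`, `struct array` and both functions are simplified stand-ins for illustration; the real flag handling in md is more involved.

```c
#include <stdbool.h>

enum { CHANGE_CLEAN = 1u << 0 };   /* stand-in for MD_CHANGE_CLEAN */

struct array {
    bool persistent;               /* internal metadata updates in use? */
    unsigned flags;
};

/* Ask for a metadata update only if something will later clear the flag. */
void checkpoint_resync(struct array *a)
{
    if (a->persistent)
        a->flags |= CHANGE_CLEAN;
}

/* Writers wait while the flag is set, so a flag nobody clears means a stall. */
bool write_would_block(const struct array *a)
{
    return (a->flags & CHANGE_CLEAN) != 0;
}
```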
Signed-off-by: NeilBrown <neilb@suse.de>
---
drivers/md/bitmap.c | 3 ++-
drivers/md/md.c | 3 ++-
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
index 1742435..4518994 100644
--- a/drivers/md/bitmap.c
+++ b/drivers/md/bitmap.c
@@ -1526,7 +1526,8 @@ void bitmap_cond_end_sync(struct bitmap *bitmap, sector_t sector)
atomic_read(&bitmap->mddev->recovery_active) == 0);
bitmap->mddev->curr_resync_completed = bitmap->mddev->curr_resync;
- set_bit(MD_CHANGE_CLEAN, &bitmap->mddev->flags);
+ if (bitmap->mddev->persistent)
+ set_bit(MD_CHANGE_CLEAN, &bitmap->mddev->flags);
sector &= ~((1ULL << CHUNK_BLOCK_SHIFT(bitmap)) - 1);
s = 0;
while (s < sector && s < bitmap->mddev->resync_max_sectors) {
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 6fbd1c7..2337cb2 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6726,7 +6726,8 @@ void md_do_sync(mddev_t *mddev)
atomic_read(&mddev->recovery_active) == 0);
mddev->curr_resync_completed =
mddev->curr_resync;
- set_bit(MD_CHANGE_CLEAN, &mddev->flags);
+ if (mddev->persistent)
+ set_bit(MD_CHANGE_CLEAN, &mddev->flags);
sysfs_notify(&mddev->kobj, NULL, "sync_completed");
}
* [PATCH 05/24] md: split out md_rdev_init
2010-06-01 9:56 [PATCH 00/24] dm-raid456 support using md/raid5.c, now with dirty-log NeilBrown
` (7 preceding siblings ...)
2010-06-01 9:56 ` [PATCH 04/24] md: be more careful setting MD_CHANGE_CLEAN NeilBrown
@ 2010-06-01 9:56 ` NeilBrown
2010-06-01 9:56 ` [PATCH 12/24] md/plug: optionally use plugger to unplug an array during resync/recovery NeilBrown
` (14 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: NeilBrown @ 2010-06-01 9:56 UTC (permalink / raw)
To: Heinz Mauelshagen, Alasdair G Kergon; +Cc: linux-raid, dm-devel
This functionality will be needed separately in a subsequent patch, so
split it into its own exported function.
Signed-off-by: NeilBrown <neilb@suse.de>
---
drivers/md/md.c | 34 +++++++++++++++++++---------------
drivers/md/md.h | 1 +
2 files changed, 20 insertions(+), 15 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 2337cb2..d8e8a8c 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -2690,6 +2690,24 @@ static struct kobj_type rdev_ktype = {
.default_attrs = rdev_default_attrs,
};
+void md_rdev_init(mdk_rdev_t *rdev)
+{
+ rdev->desc_nr = -1;
+ rdev->saved_raid_disk = -1;
+ rdev->raid_disk = -1;
+ rdev->flags = 0;
+ rdev->data_offset = 0;
+ rdev->sb_events = 0;
+ rdev->last_read_error.tv_sec = 0;
+ rdev->last_read_error.tv_nsec = 0;
+ atomic_set(&rdev->nr_pending, 0);
+ atomic_set(&rdev->read_errors, 0);
+ atomic_set(&rdev->corrected_errors, 0);
+
+ INIT_LIST_HEAD(&rdev->same_set);
+ init_waitqueue_head(&rdev->blocked_wait);
+}
+EXPORT_SYMBOL_GPL(md_rdev_init);
/*
* Import a device. If 'super_format' >= 0, then sanity check the superblock
*
@@ -2713,6 +2731,7 @@ static mdk_rdev_t *md_import_device(dev_t newdev, int super_format, int super_mi
return ERR_PTR(-ENOMEM);
}
+ md_rdev_init(rdev);
if ((err = alloc_disk_sb(rdev)))
goto abort_free;
@@ -2722,18 +2741,6 @@ static mdk_rdev_t *md_import_device(dev_t newdev, int super_format, int super_mi
kobject_init(&rdev->kobj, &rdev_ktype);
- rdev->desc_nr = -1;
- rdev->saved_raid_disk = -1;
- rdev->raid_disk = -1;
- rdev->flags = 0;
- rdev->data_offset = 0;
- rdev->sb_events = 0;
- rdev->last_read_error.tv_sec = 0;
- rdev->last_read_error.tv_nsec = 0;
- atomic_set(&rdev->nr_pending, 0);
- atomic_set(&rdev->read_errors, 0);
- atomic_set(&rdev->corrected_errors, 0);
-
size = rdev->bdev->bd_inode->i_size >> BLOCK_SIZE_BITS;
if (!size) {
printk(KERN_WARNING
@@ -2762,9 +2769,6 @@ static mdk_rdev_t *md_import_device(dev_t newdev, int super_format, int super_mi
}
}
- INIT_LIST_HEAD(&rdev->same_set);
- init_waitqueue_head(&rdev->blocked_wait);
-
return rdev;
abort_free:
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 1f680e4..a9cde80 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -484,4 +484,5 @@ extern void md_integrity_add_rdev(mdk_rdev_t *rdev, mddev_t *mddev);
extern int strict_strtoul_scaled(const char *cp, unsigned long *res, int scale);
extern void restore_bitmap_write_access(struct file *file);
+extern void md_rdev_init(mdk_rdev_t *rdev);
#endif /* _MD_MD_H */
* [PATCH 06/24] md: export various start/stop interfaces
2010-06-01 9:56 [PATCH 00/24] dm-raid456 support using md/raid5.c, now with dirty-log NeilBrown
` (3 preceding siblings ...)
2010-06-01 9:56 ` [PATCH 08/24] dm-raid456: add support for raising events to userspace NeilBrown
@ 2010-06-01 9:56 ` NeilBrown
2010-06-01 9:56 ` [PATCH 14/24] dm-raid456: add support for setting IO hints NeilBrown
` (18 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: NeilBrown @ 2010-06-01 9:56 UTC (permalink / raw)
To: Heinz Mauelshagen, Alasdair G Kergon; +Cc: linux-raid, dm-devel
Export entry points for starting and stopping md arrays.
This will be used by a module to make md/raid5 work under
dm.
Also stop calling md_stop_writes from md_stop, as that won't
work well with dm - it will want to call the two separately.
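Roughly, the resulting call structure looks like the sketch below. The struct and function names are simplified stand-ins, not the md interfaces themselves.

```c
struct array {
    int writes_quiesced;
    int running;
};

/* Quiesce outstanding writes; can now be called on its own. */
void stop_writes(struct array *a)
{
    a->writes_quiesced = 1;
}

/* No longer quiesces writes itself; the caller decides when that happens. */
void stop(struct array *a)
{
    a->running = 0;
}

/* md-style full stop: both steps back to back. */
void full_stop(struct array *a)
{
    stop_writes(a);
    stop(a);
}
```

A dm target would presumably call the two halves from different hooks, while the existing md path keeps the old combined behaviour.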
Signed-off-by: NeilBrown <neilb@suse.de>
---
drivers/md/md.c | 15 +++++++++------
drivers/md/md.h | 4 ++++
2 files changed, 13 insertions(+), 6 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index d8e8a8c..4eccf4e 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -416,7 +416,7 @@ static void mddev_put(mddev_t *mddev)
spin_unlock(&all_mddevs_lock);
}
-static void mddev_init(mddev_t *mddev)
+void mddev_init(mddev_t *mddev)
{
mutex_init(&mddev->open_mutex);
mutex_init(&mddev->reconfig_mutex);
@@ -436,6 +436,7 @@ static void mddev_init(mddev_t *mddev)
mddev->resync_max = MaxSector;
mddev->level = LEVEL_NONE;
}
+EXPORT_SYMBOL_GPL(mddev_init);
static mddev_t * mddev_find(dev_t unit)
{
@@ -4307,7 +4308,7 @@ static void md_safemode_timeout(unsigned long data)
static int start_dirty_degraded;
-static int md_run(mddev_t *mddev)
+int md_run(mddev_t *mddev)
{
int err;
mdk_rdev_t *rdev;
@@ -4511,6 +4512,7 @@ static int md_run(mddev_t *mddev)
sysfs_notify(&mddev->kobj, NULL, "degraded");
return 0;
}
+EXPORT_SYMBOL_GPL(md_run);
static int do_md_run(mddev_t *mddev)
{
@@ -4620,7 +4622,7 @@ static void md_clean(mddev_t *mddev)
mddev->bitmap_info.max_write_behind = 0;
}
-static void md_stop_writes(mddev_t *mddev)
+void md_stop_writes(mddev_t *mddev)
{
if (mddev->sync_thread) {
set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
@@ -4640,11 +4642,10 @@ static void md_stop_writes(mddev_t *mddev)
md_update_sb(mddev, 1);
}
}
+EXPORT_SYMBOL_GPL(md_stop_writes);
-static void md_stop(mddev_t *mddev)
+void md_stop(mddev_t *mddev)
{
- md_stop_writes(mddev);
-
mddev->pers->stop(mddev);
if (mddev->pers->sync_request && mddev->to_remove == NULL)
mddev->to_remove = &md_redundancy_group;
@@ -4652,6 +4653,7 @@ static void md_stop(mddev_t *mddev)
mddev->pers = NULL;
clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
}
+EXPORT_SYMBOL_GPL(md_stop);
static int md_set_readonly(mddev_t *mddev, int is_open)
{
@@ -4698,6 +4700,7 @@ static int do_md_stop(mddev_t * mddev, int mode, int is_open)
if (mddev->ro)
set_disk_ro(disk, 0);
+ md_stop_writes(mddev);
md_stop(mddev);
mddev->queue->merge_bvec_fn = NULL;
mddev->queue->unplug_fn = NULL;
diff --git a/drivers/md/md.h b/drivers/md/md.h
index a9cde80..8e19e86 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -484,5 +484,9 @@ extern void md_integrity_add_rdev(mdk_rdev_t *rdev, mddev_t *mddev);
extern int strict_strtoul_scaled(const char *cp, unsigned long *res, int scale);
extern void restore_bitmap_write_access(struct file *file);
+extern void mddev_init(mddev_t *mddev);
+extern int md_run(mddev_t *mddev);
+extern void md_stop(mddev_t *mddev);
+extern void md_stop_writes(mddev_t *mddev);
extern void md_rdev_init(mdk_rdev_t *rdev);
#endif /* _MD_MD_H */
* [PATCH 08/24] dm-raid456: add support for raising events to userspace.
2010-06-01 9:56 [PATCH 00/24] dm-raid456 support using md/raid5.c, now with dirty-log NeilBrown
` (2 preceding siblings ...)
2010-06-01 9:56 ` [PATCH 02/24] md/raid5: factor out code for changing size of stripe cache NeilBrown
@ 2010-06-01 9:56 ` NeilBrown
2010-06-01 9:56 ` [PATCH 06/24] md: export various start/stop interfaces NeilBrown
` (19 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: NeilBrown @ 2010-06-01 9:56 UTC (permalink / raw)
To: Heinz Mauelshagen, Alasdair G Kergon; +Cc: linux-raid, dm-devel
Userspace needs to know about failure events. DM handles
those through the DM_DEV_WAIT_CMD ioctl.
So allow md_error to be given some work to do on an error,
and arrange that work to signal dm.
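The mechanism is an optional hook embedded in the array structure, with the owner recovered via container_of. Below is a userspace sketch under simplifying assumptions: the hook is called directly instead of being deferred through schedule_work(), and a counter stands in for dm_table_event(). The struct layouts are invented for the example.

```c
#include <stddef.h>

struct work { void (*func)(struct work *); };
struct array { struct work event_work; int errors; };
struct raid_set { struct array md; int table_events; };

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Owner's handler: find the enclosing raid_set from the embedded work item. */
void do_table_event(struct work *w)
{
    struct raid_set *rs = container_of(w, struct raid_set, md.event_work);
    rs->table_events++;            /* stand-in for dm_table_event() */
}

/* Error path: run the hook only if an owner registered one. */
void array_error(struct array *a)
{
    a->errors++;
    if (a->event_work.func)
        a->event_work.func(&a->event_work);   /* kernel: schedule_work() */
}
```

Plain md arrays never set the hook, so they are unaffected.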
Signed-off-by: NeilBrown <neilb@suse.de>
---
drivers/md/dm-raid456.c | 8 ++++++++
drivers/md/md.c | 2 ++
drivers/md/md.h | 1 +
3 files changed, 11 insertions(+), 0 deletions(-)
diff --git a/drivers/md/dm-raid456.c b/drivers/md/dm-raid456.c
index 5185a8f..d54f901 100644
--- a/drivers/md/dm-raid456.c
+++ b/drivers/md/dm-raid456.c
@@ -139,6 +139,13 @@ static int dev_parms(struct raid_set *rs, char **argv)
return 0;
}
+static void do_table_event(struct work_struct *ws)
+{
+ struct raid_set *rs = container_of(ws, struct raid_set,
+ md.event_work);
+ dm_table_event(rs->ti->table);
+}
+
/*
* Construct a RAID4/5/6 mapping:
* Args:
@@ -290,6 +297,7 @@ static int raid_ctr(struct dm_target *ti, unsigned argc, char **argv)
if (rs->md.raid_disks - in_sync > rt->parity_devs)
goto err;
+ INIT_WORK(&rs->md.event_work, do_table_event);
ti->split_io = rs->md.chunk_sectors;
ti->private = rs;
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 4eccf4e..da860f7 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6039,6 +6039,8 @@ void md_error(mddev_t *mddev, mdk_rdev_t *rdev)
set_bit(MD_RECOVERY_INTR, &mddev->recovery);
set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
md_wakeup_thread(mddev->thread);
+ if (mddev->event_work.func)
+ schedule_work(&mddev->event_work);
md_new_event_inintr(mddev);
}
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 8e19e86..6318175 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -315,6 +315,7 @@ struct mddev_s
struct bio *barrier;
atomic_t flush_pending;
struct work_struct barrier_work;
+ struct work_struct event_work;
};
* [PATCH 07/24] md/dm: create dm-raid456 module using md/raid5
2010-06-01 9:56 [PATCH 00/24] dm-raid456 support using md/raid5.c, now with dirty-log NeilBrown
` (9 preceding siblings ...)
2010-06-01 9:56 ` [PATCH 12/24] md/plug: optionally use plugger to unplug an array during resync/recovery NeilBrown
@ 2010-06-01 9:56 ` NeilBrown
2010-06-01 9:56 ` [PATCH 11/24] md/raid5: add simple plugging infrastructure NeilBrown
` (12 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: NeilBrown @ 2010-06-01 9:56 UTC (permalink / raw)
To: Heinz Mauelshagen, Alasdair G Kergon; +Cc: linux-raid, dm-devel
Signed-off-by: NeilBrown <neilb@suse.de>
---
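The constructor in this patch rejects a table unless the target length
divides evenly across the data devices and the resulting per-device
length is a whole number of chunks. A minimal userspace sketch of that
arithmetic follows; the function name and the use of plain uint64_t
instead of sector_t/sector_div() are assumptions made for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the size checks in raid_ctr().
 * Returns 0 and stores the per-device length on success, -1 otherwise.
 */
static int per_device_sectors(uint64_t target_len, unsigned raid_devs,
			      unsigned parity_devs, uint64_t chunk_sectors,
			      uint64_t *sectors_per_dev)
{
	uint64_t data_devs;

	if (raid_devs <= parity_devs || chunk_sectors == 0)
		return -1;
	data_devs = raid_devs - parity_devs;

	/* Target length must split evenly across the data devices. */
	if (target_len % data_devs)
		return -1;

	/* Each device must hold a whole number of chunks. */
	if ((target_len / data_devs) % chunk_sectors)
		return -1;

	*sectors_per_dev = target_len / data_devs;
	return 0;
}
```

For example, a 3072-sector raid5 target on 4 devices with 128-sector
chunks gives 1024 sectors per device, while a 3000-sector target with the
same geometry is rejected because 1000 is not a multiple of 128.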
drivers/md/Kconfig | 8 +
drivers/md/Makefile | 1
drivers/md/dm-raid456.c | 437 +++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 446 insertions(+), 0 deletions(-)
create mode 100644 drivers/md/dm-raid456.c
diff --git a/drivers/md/Kconfig b/drivers/md/Kconfig
index 4a6feac..3465363 100644
--- a/drivers/md/Kconfig
+++ b/drivers/md/Kconfig
@@ -256,6 +256,14 @@ config DM_MIRROR
Allow volume managers to mirror logical volumes, also
needed for live data migration tools such as 'pvmove'.
+config DM_RAID456
+ tristate "RAID 4/5/6 target (EXPERIMENTAL)"
+ depends on BLK_DEV_DM && MD_RAID456 && EXPERIMENTAL
+ ---help---
+ A dm target that supports RAID4 RAID5 and RAID6 mapping
+
+ If unsure, say N.
+
config DM_LOG_USERSPACE
tristate "Mirror userspace logging (EXPERIMENTAL)"
depends on DM_MIRROR && EXPERIMENTAL && NET
diff --git a/drivers/md/Makefile b/drivers/md/Makefile
index e355e7f..0734fba 100644
--- a/drivers/md/Makefile
+++ b/drivers/md/Makefile
@@ -44,6 +44,7 @@ obj-$(CONFIG_DM_SNAPSHOT) += dm-snapshot.o
obj-$(CONFIG_DM_MIRROR) += dm-mirror.o dm-log.o dm-region-hash.o
obj-$(CONFIG_DM_LOG_USERSPACE) += dm-log-userspace.o
obj-$(CONFIG_DM_ZERO) += dm-zero.o
+obj-$(CONFIG_DM_RAID456) += dm-raid456.o
quiet_cmd_unroll = UNROLL $@
cmd_unroll = $(AWK) -f$(srctree)/$(src)/unroll.awk -vN=$(UNROLL) \
diff --git a/drivers/md/dm-raid456.c b/drivers/md/dm-raid456.c
new file mode 100644
index 0000000..5185a8f
--- /dev/null
+++ b/drivers/md/dm-raid456.c
@@ -0,0 +1,437 @@
+
+/*
+ * dm-raid456 - implemented as wrapper for md/raid456
+ *
+ */
+#include <linux/slab.h>
+#include "md.h"
+#include "raid5.h"
+#include "dm.h"
+
+struct raid_dev {
+ struct dm_dev *dev;
+ struct mdk_rdev_s rdev;
+};
+
+struct raid_set {
+ struct dm_target *ti;
+ struct mddev_s md;
+ struct raid_type *raid_type;
+ struct raid_dev dev[0];
+};
+
+/* Supported raid types and properties. */
+static struct raid_type {
+ const char *name; /* RAID algorithm. */
+ const char *descr; /* Descriptor text for logging. */
+ const unsigned parity_devs; /* # of parity devices. */
+ const unsigned minimal_devs; /* minimal # of devices in set. */
+ const unsigned level; /* RAID level. */
+ const unsigned algorithm; /* RAID algorithm. */
+} raid_types[] = {
+ {"raid4", "RAID4 (dedicated parity disk)", 1, 2, 5, ALGORITHM_PARITY_0},
+ {"raid5_la", "RAID5 (left asymmetric)", 1, 2, 5, ALGORITHM_LEFT_ASYMMETRIC},
+ {"raid5_ra", "RAID5 (right asymmetric)", 1, 2, 5, ALGORITHM_RIGHT_ASYMMETRIC},
+ {"raid5_ls", "RAID5 (left symmetric)", 1, 2, 5, ALGORITHM_LEFT_SYMMETRIC},
+ {"raid5_rs", "RAID5 (right symmetric)", 1, 2, 5, ALGORITHM_RIGHT_SYMMETRIC},
+ {"raid6_zr", "RAID6 (zero restart)", 2, 4, 6, ALGORITHM_ROTATING_ZERO_RESTART },
+ {"raid6_nr", "RAID6 (N restart)", 2, 4, 6, ALGORITHM_ROTATING_N_RESTART},
+ {"raid6_nc", "RAID6 (N continue)", 2, 4, 6, ALGORITHM_ROTATING_N_CONTINUE}
+};
+
+static struct raid_type *get_raid_type(char *name)
+{
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(raid_types); i++)
+ if (strcmp(raid_types[i].name, name) == 0)
+ return &raid_types[i];
+ return NULL;
+}
+
+static struct raid_set *
+context_alloc(struct raid_type *raid_type,
+ unsigned long chunk_size,
+ int recovery,
+ long raid_devs, sector_t sectors_per_dev,
+ struct dm_target *ti)
+{
+ struct raid_set *rs;
+
+ rs = kzalloc(sizeof(*rs) + raid_devs * sizeof(rs->dev[0]),
+ GFP_KERNEL);
+ if (!rs) {
+ ti->error = "Cannot allocate raid context";
+ return ERR_PTR(-ENOMEM);
+ }
+
+ mddev_init(&rs->md);
+
+ rs->ti = ti;
+ rs->raid_type = raid_type;
+ rs->md.raid_disks = raid_devs;
+ rs->md.level = raid_type->level;
+ rs->md.dev_sectors = sectors_per_dev;
+ rs->md.persistent = 0;
+ rs->md.external = 1;
+ rs->md.layout = raid_type->algorithm;
+ rs->md.chunk_sectors = chunk_size;
+ rs->md.recovery_cp = recovery ? 0 : MaxSector;
+
+ rs->md.new_level = rs->md.level;
+ rs->md.new_chunk_sectors = rs->md.chunk_sectors;
+ rs->md.new_layout = rs->md.layout;
+ rs->md.delta_disks = 0;
+
+ return rs;
+}
+
+static void context_free(struct raid_set *rs)
+{
+ int i;
+ for (i = 0; i < rs->md.raid_disks; i++)
+ if (rs->dev[i].dev)
+ dm_put_device(rs->ti, rs->dev[i].dev);
+ kfree(rs);
+}
+
+/* For every device we have two words
+ * device name, or "-" if missing
+ * offset from start of devices, in sectors
+ *
+ * This code parses those words.
+ */
+static int dev_parms(struct raid_set *rs, char **argv)
+{
+ int i;
+
+ for (i = 0; i < rs->md.raid_disks; i++, argv += 2) {
+ int err = 0;
+ unsigned long long offset;
+
+ md_rdev_init(&rs->dev[i].rdev);
+ rs->dev[i].rdev.raid_disk = i;
+
+ if (strcmp(argv[0], "-") == 0)
+ rs->dev[i].dev = NULL;
+ else
+ err = dm_get_device(rs->ti, argv[0],
+ dm_table_get_mode(rs->ti->table),
+ &rs->dev[i].dev);
+ if (err) {
+ rs->ti->error = "RAID device lookup failure";
+ return err;
+ }
+ if (strict_strtoull(argv[1], 10, &offset) < 0) {
+ rs->ti->error = "RAID device offset is bad";
+ return -EINVAL;
+ }
+ rs->dev[i].rdev.data_offset = offset;
+
+ set_bit(In_sync, &rs->dev[i].rdev.flags);
+
+ rs->dev[i].rdev.mddev = &rs->md;
+ if (rs->dev[i].dev) {
+ rs->dev[i].rdev.bdev = rs->dev[i].dev->bdev;
+ list_add(&rs->dev[i].rdev.same_set, &rs->md.disks);
+ }
+ }
+ return 0;
+}
+
+/*
+ * Construct a RAID4/5/6 mapping:
+ * Args:
+ * log_type #log_params <log_params> \
+ * raid_type #raid_params <raid_params> \
+ * rebuild-drive-A [rebuild-drive-B] \
+ * #raid_devs { <dev_path> <offset> }
+ * (a missing device is identified by dev_path == "-")
+ *
+ * log_type must be 'core'. We ignore region_size and use sync/nosync to
+ * decide if a resync is needed.
+ * raid_type is from "raid_types" above
+ * There are as many 'rebuild-drives' as 'parity_devs' in the raid_type.
+ * -1 means no drive needs rebuilding.
+ * raid_params are:
+ * chunk_size - in sectors, must be power of 2
+ */
+static int raid_ctr(struct dm_target *ti, unsigned argc, char **argv)
+{
+ char *err = NULL;
+ int errnum = -EINVAL;
+ unsigned long cnt;
+ struct raid_type *rt;
+ unsigned long chunk_size;
+ int recovery = 1;
+ long raid_devs;
+ long rebuildA, rebuildB;
+ sector_t sectors_per_dev, chunks;
+ struct raid_set *rs = NULL;
+ int in_sync, i;
+
+ /* log type - core XXX [no]sync */
+ err = "Cannot parse log type";
+ if (argc < 2 ||
+ strcmp(argv[0], "core") != 0 ||
+ strict_strtoul(argv[1], 10, &cnt) < 0 ||
+ cnt + 2 > argc)
+ goto err;
+ if (cnt >= 2 && strcmp(argv[3], "sync") == 0)
+ recovery = 0;
+ argc -= cnt+2;
+ argv += cnt+2;
+
+ /* raid type */
+ err = "Cannot find raid_type";
+ if (argc < 1 ||
+ (rt = get_raid_type(argv[0])) == NULL)
+ goto err;
+ argc--; argv++;
+
+ /* number of parameters */
+ err = "Cannot understand number of RAID parameters";
+ if (argc < 1 ||
+ strict_strtoul(argv[0], 10, &cnt) < 0 ||
+ cnt + 1 > argc)
+ goto err;
+ argc--; argv++;
+
+ /* chunk size */
+ if (cnt) {
+ err = "Bad chunk size";
+ if (strict_strtoul(argv[0], 10, &chunk_size) < 0
+ || !is_power_of_2(chunk_size)
+ || chunk_size < 8
+ )
+ goto err;
+ cnt--; argc--; argv++;
+ }
+ /* Skip any extra args */
+ argc -= cnt;
+ argv += cnt;
+
+ /* drives needing rebuild */
+ err = "Cannot parse rebuild-drives";
+ if (argc < 1 ||
+ strict_strtol(argv[0], 10, &rebuildA) < 0)
+ goto err;
+ argc--; argv++;
+
+ rebuildB = -1;
+ if (rt->parity_devs == 2) {
+ if (argc < 1 ||
+ strict_strtol(argv[0], 10, &rebuildB) < 0)
+ goto err;
+ argc--; argv++;
+ }
+
+ /* number of raid devs */
+ err = "Bad number of raid devices";
+ if (argc < 1 ||
+ strict_strtol(argv[0], 10, &raid_devs) < 0 ||
+ raid_devs < rt->minimal_devs)
+ goto err;
+
+ err = "Bad number for rebuild device";
+ if (rebuildA < -1 || rebuildB < -1 ||
+ rebuildA >= raid_devs || rebuildB >= raid_devs)
+ goto err;
+
+ argc--; argv++;
+ err = "Wrong number of arguments for number of raid devices";
+ if (argc != raid_devs * 2)
+ goto err;
+
+ /* check the sizes all match */
+ sectors_per_dev = ti->len;
+ err = "Target length not divisible by number of data devices";
+ if (sector_div(sectors_per_dev, (raid_devs - rt->parity_devs)))
+ goto err;
+ chunks = sectors_per_dev;
+ err = "Device length not divisible by chunk_size";
+ if (sector_div(chunks, chunk_size))
+ goto err;
+
+
+ /* Now the devices: two words each */
+ rs = context_alloc(rt, chunk_size, recovery,
+ raid_devs, sectors_per_dev,
+ ti);
+ if (IS_ERR(rs))
+ return PTR_ERR(rs);
+
+ errnum = dev_parms(rs, argv);
+ if (errnum) {
+ err = ti->error;
+ goto err;
+ }
+ errnum = -EINVAL;
+
+ err = "Rebuild device not present";
+ if (rebuildA >= 0) {
+ if (rs->dev[rebuildA].dev == NULL)
+ goto err;
+ clear_bit(In_sync, &rs->dev[rebuildA].rdev.flags);
+ rs->dev[rebuildA].rdev.recovery_offset = 0;
+ }
+ if (rebuildB >= 0) {
+ if (rs->dev[rebuildB].dev == NULL)
+ goto err;
+ clear_bit(In_sync, &rs->dev[rebuildB].rdev.flags);
+ rs->dev[rebuildB].rdev.recovery_offset = 0;
+ }
+ in_sync = 0;
+ for (i = 0; i < rs->md.raid_disks; i++)
+ if (rs->dev[i].dev &&
+ test_bit(In_sync, &rs->dev[i].rdev.flags))
+ in_sync++;
+ err = "Insufficient active RAID devices";
+ if (rs->md.raid_disks - in_sync > rt->parity_devs)
+ goto err;
+
+ ti->split_io = rs->md.chunk_sectors;
+ ti->private = rs;
+
+ mutex_lock(&rs->md.reconfig_mutex);
+ err = "Fail to run raid array";
+ errnum = md_run(&rs->md);
+ rs->md.in_sync = 0; /* Assume already marked dirty */
+ mutex_unlock(&rs->md.reconfig_mutex);
+
+ if (errnum)
+ goto err;
+ return 0;
+err:
+ if (rs)
+ context_free(rs);
+ ti->error = err;
+ return errnum;
+}
+
+static void raid_dtr(struct dm_target *ti)
+{
+ struct raid_set *rs = ti->private;
+
+ md_stop(&rs->md);
+ context_free(rs);
+}
+
+static int raid_map(struct dm_target *ti, struct bio *bio,
+ union map_info *map_context)
+{
+ struct raid_set *rs = ti->private;
+ mddev_t *mddev = &rs->md;
+
+ mddev->pers->make_request(mddev, bio);
+ return DM_MAPIO_SUBMITTED;
+}
+
+static int raid_status(struct dm_target *ti, status_type_t type,
+ char *result, unsigned maxlen)
+{
+ struct raid_set *rs = ti->private;
+ struct raid5_private_data *conf = rs->md.private;
+ int sz = 0;
+ int rbcnt;
+ int i;
+ sector_t sync;
+
+ switch (type) {
+ case STATUSTYPE_INFO:
+ DMEMIT("%u ", rs->md.raid_disks);
+ for (i = 0; i < rs->md.raid_disks; i++) {
+ if (rs->dev[i].dev)
+ DMEMIT("%s ", rs->dev[i].dev->name);
+ else
+ DMEMIT("- ");
+ }
+ for (i = 0; i < rs->md.raid_disks; i++) {
+ if (test_bit(Faulty, &rs->dev[i].rdev.flags))
+ DMEMIT("D");
+ else if (test_bit(In_sync, &rs->dev[i].rdev.flags))
+ DMEMIT("A");
+ else
+ DMEMIT("Ai");
+ }
+ DMEMIT(" %u ", conf->max_nr_stripes);
+ if (test_bit(MD_RECOVERY_RUNNING, &rs->md.recovery))
+ sync = rs->md.curr_resync_completed;
+ else
+ sync = rs->md.recovery_cp;
+ if (sync > rs->md.resync_max_sectors)
+ sync = rs->md.resync_max_sectors;
+ DMEMIT("%llu/%llu ",
+ (unsigned long long) sync,
+ (unsigned long long) rs->md.resync_max_sectors);
+ DMEMIT("1 core");
+
+ break;
+ case STATUSTYPE_TABLE:
+ /* The string you would use to construct this array */
+ /* Pretend to use a core log with a region size of 1 sector */
+ DMEMIT("core 2 %u %ssync ", 1,
+ rs->md.recovery_cp == MaxSector ? "" : "no");
+ DMEMIT("%s ", rs->raid_type->name);
+ DMEMIT("1 %u ", rs->md.chunk_sectors);
+
+ /* Print 1 or 2 rebuild_dev numbers */
+ rbcnt = 0;
+ for (i = 0; i < rs->md.raid_disks; i++)
+ if (rs->dev[i].dev &&
+ !test_bit(In_sync, &rs->dev[i].rdev.flags) &&
+ rbcnt < rs->raid_type->parity_devs) {
+ DMEMIT("%u ", i);
+ rbcnt++;
+ }
+ while (rbcnt < rs->raid_type->parity_devs) {
+ DMEMIT("-1 ");
+ rbcnt++;
+ }
+
+ DMEMIT("%u ", rs->md.raid_disks);
+ for (i = 0; i < rs->md.raid_disks; i++) {
+ mdk_rdev_t *rdev = &rs->dev[i].rdev;
+
+ if (rs->dev[i].dev)
+ DMEMIT("%s ", rs->dev[i].dev->name);
+ else
+ DMEMIT("- ");
+
+ DMEMIT("%llu ", (unsigned long long)rdev->data_offset);
+ }
+ break;
+ }
+ return 0;
+}
+
+static struct target_type raid_target = {
+ .name = "raid45",
+ .version = {1, 0, 0},
+ .module = THIS_MODULE,
+ .ctr = raid_ctr,
+ .dtr = raid_dtr,
+ .map = raid_map,
+ .status = raid_status,
+};
+
+static int __init dm_raid_init(void)
+{
+ int r = dm_register_target(&raid_target);
+
+ return r;
+}
+
+static void __exit dm_raid_exit(void)
+{
+ dm_unregister_target(&raid_target);
+}
+
+module_init(dm_raid_init);
+module_exit(dm_raid_exit);
+
+MODULE_DESCRIPTION(DM_NAME " raid4/5/6 target");
+MODULE_LICENSE("GPL");
+MODULE_ALIAS("dm-raid4");
+MODULE_ALIAS("dm-raid5");
+MODULE_ALIAS("dm-raid6");
* [PATCH 09/24] raid5: Don't set read-ahead when there is no queue
2010-06-01 9:56 [PATCH 00/24] dm-raid456 support using md/raid5.c, now with dirty-log NeilBrown
` (12 preceding siblings ...)
2010-06-01 9:56 ` [PATCH 13/24] dm-raid456: support unplug NeilBrown
@ 2010-06-01 9:56 ` NeilBrown
2010-06-01 9:56 ` [PATCH 19/24] md/bitmap: clean up plugging calls NeilBrown
` (9 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: NeilBrown @ 2010-06-01 9:56 UTC (permalink / raw)
To: Heinz Mauelshagen, Alasdair G Kergon; +Cc: linux-raid, dm-devel
dm-raid456 does not provide a 'queue' for raid5 to use,
so we must make raid5 stop depending on the queue.
First: read_ahead
dm handles read-ahead adjustment fully in userspace, so
simply don't do any readahead adjustments if there is
no queue.
Also re-arrange code slightly so all the accesses to ->queue are
together.
Finally, move the blk_queue_merge_bvec call into the 'if' as
the ->split_io setting in dm-raid456 has the same effect.
Signed-off-by: NeilBrown <neilb@suse.de>
---
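The read-ahead rule that this patch moves under the queue check can be
sketched in userspace as below. It is only an illustration: the struct,
the function name and the fixed 4096-byte page size are assumptions, and
a NULL queue stands for the dm-raid456 case where no adjustment is made.

```c
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumed page size for the sketch */

/* Hypothetical stand-in for the queue's backing_dev_info. */
struct queue_model {
	unsigned long ra_pages;
};

/*
 * Raise read-ahead so that it covers at least two whole stripes.
 * With no queue, leave read-ahead to userspace and do nothing.
 */
static void raise_read_ahead(struct queue_model *q, int raid_disks,
			     int max_degraded, unsigned chunk_sectors)
{
	unsigned long stripe;

	if (!q)
		return;

	/* Pages in one stripe: data disks times pages per chunk. */
	stripe = (unsigned long)(raid_disks - max_degraded) *
		 (((unsigned long)chunk_sectors << 9) / MODEL_PAGE_SIZE);

	if (q->ra_pages < 2 * stripe)
		q->ra_pages = 2 * stripe;
}
```

With 4 devices, one parity device and 128-sector chunks, a stripe is
3 * 16 = 48 pages, so read-ahead is raised to at least 96 pages and a
larger existing value is left alone.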
drivers/md/raid5.c | 32 +++++++++++++++++---------------
1 files changed, 17 insertions(+), 15 deletions(-)
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 74e36a2..8839573 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -5072,16 +5072,6 @@ static int run(mddev_t *mddev)
"reshape");
}
- /* read-ahead size must cover two whole stripes, which is
- * 2 * (datadisks) * chunksize where 'n' is the number of raid devices
- */
- {
- int data_disks = conf->previous_raid_disks - conf->max_degraded;
- int stripe = data_disks *
- ((mddev->chunk_sectors << 9) / PAGE_SIZE);
- if (mddev->queue->backing_dev_info.ra_pages < 2 * stripe)
- mddev->queue->backing_dev_info.ra_pages = 2 * stripe;
- }
/* Ok, everything is just fine now */
if (mddev->to_remove == &raid5_attrs_group)
@@ -5089,8 +5079,23 @@ static int run(mddev_t *mddev)
else if (mddev->kobj.sd &&
sysfs_create_group(&mddev->kobj, &raid5_attrs_group))
printk(KERN_WARNING
- "md/raid:%s: failed to create sysfs attributes.\n",
+ "raid5: failed to create sysfs attributes for %s\n",
mdname(mddev));
+ md_set_array_sectors(mddev, raid5_size(mddev, 0, 0));
+
+ if (mddev->queue) {
+ /* read-ahead size must cover two whole stripes, which
+ * is 2 * (datadisks) * chunksize where 'n' is the
+ * number of raid devices
+ */
+ int data_disks = conf->previous_raid_disks - conf->max_degraded;
+ int stripe = data_disks *
+ ((mddev->chunk_sectors << 9) / PAGE_SIZE);
+ if (mddev->queue->backing_dev_info.ra_pages < 2 * stripe)
+ mddev->queue->backing_dev_info.ra_pages = 2 * stripe;
+
+ blk_queue_merge_bvec(mddev->queue, raid5_mergeable_bvec);
+ }
mddev->queue->queue_lock = &conf->device_lock;
@@ -5098,9 +5103,6 @@ static int run(mddev_t *mddev)
mddev->queue->backing_dev_info.congested_data = mddev;
mddev->queue->backing_dev_info.congested_fn = raid5_congested;
- md_set_array_sectors(mddev, raid5_size(mddev, 0, 0));
-
- blk_queue_merge_bvec(mddev->queue, raid5_mergeable_bvec);
chunk_size = mddev->chunk_sectors << 9;
blk_queue_io_min(mddev->queue, chunk_size);
blk_queue_io_opt(mddev->queue, chunk_size *
@@ -5523,7 +5525,7 @@ static void end_reshape(raid5_conf_t *conf)
/* read-ahead size must cover two whole stripes, which is
* 2 * (datadisks) * chunksize where 'n' is the number of raid devices
*/
- {
+ if (conf->mddev->queue) {
int data_disks = conf->raid_disks - conf->max_degraded;
int stripe = data_disks * ((conf->chunk_sectors << 9)
/ PAGE_SIZE);
* [PATCH 10/24] dm-raid456: add congestion checking.
2010-06-01 9:56 [PATCH 00/24] dm-raid456 support using md/raid5.c, now with dirty-log NeilBrown
` (5 preceding siblings ...)
2010-06-01 9:56 ` [PATCH 14/24] dm-raid456: add support for setting IO hints NeilBrown
@ 2010-06-01 9:56 ` NeilBrown
2010-06-01 9:56 ` [PATCH 04/24] md: be more careful setting MD_CHANGE_CLEAN NeilBrown
` (16 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: NeilBrown @ 2010-06-01 9:56 UTC (permalink / raw)
To: Heinz Mauelshagen, Alasdair G Kergon; +Cc: linux-raid, dm-devel
dm currently implements congestion checking by checking on congestion
in each component device.
For raid456 we need to also check if the stripe cache is congested.
So add support to dm for a target to register a congestion checker,
then register such a checker for dm-raid456.
We add support for multiple callbacks as we will need one for unplug
too.
Finally, we move the setting for congested_fn for the mddev->queue
into the "if (mddev->queue)" protected branch as it is not needed
for dm-raid456 now.
Signed-off-by: NeilBrown <neilb@suse.de>
---
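The callback scheme described above amounts to a per-table list whose
congestion results are OR-ed into the result from the component devices.
A minimal userspace sketch under stated assumptions: a singly linked list
replaces the kernel's list_head, and all names ending in _model, plus
stripe_cache_congested(), are hypothetical.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct target_callbacks. */
struct callbacks_model {
	struct callbacks_model *next;
	int (*congested_fn)(void *cb, int bits);
};

/* Hypothetical stand-in for the callback list held by struct dm_table. */
struct table_model {
	struct callbacks_model *callbacks;
};

static void table_add_callbacks(struct table_model *t,
				struct callbacks_model *cb)
{
	cb->next = t->callbacks;
	t->callbacks = cb;
}

/* device_result is whatever the per-device congestion check returned. */
static int table_any_congested(struct table_model *t, int device_result,
			       int bits)
{
	struct callbacks_model *cb;
	int r = device_result;

	for (cb = t->callbacks; cb; cb = cb->next)
		if (cb->congested_fn)
			r |= cb->congested_fn(cb, bits);
	return r;
}

/* A target embeds the callbacks and recovers itself from the pointer. */
struct stripe_cache_model {
	int blocked;
	struct callbacks_model callbacks;
};

static int stripe_cache_congested(void *v, int bits)
{
	struct callbacks_model *cb = v;
	struct stripe_cache_model *sc = (struct stripe_cache_model *)
		((char *)cb - offsetof(struct stripe_cache_model, callbacks));

	(void)bits;	/* no difference between reads and writes */
	return sc->blocked;
}
```

Passing the callbacks structure itself as the callback argument lets each
target find its own state without the table knowing the target's type.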
drivers/md/dm-raid456.c | 13 +++++++++++++
drivers/md/dm-table.c | 15 +++++++++++++++
drivers/md/raid5.c | 22 +++++++++++++++-------
drivers/md/raid5.h | 1 +
include/linux/device-mapper.h | 12 ++++++++++++
5 files changed, 56 insertions(+), 7 deletions(-)
diff --git a/drivers/md/dm-raid456.c b/drivers/md/dm-raid456.c
index d54f901..0e3922a 100644
--- a/drivers/md/dm-raid456.c
+++ b/drivers/md/dm-raid456.c
@@ -17,6 +17,7 @@ struct raid_set {
struct dm_target *ti;
struct mddev_s md;
struct raid_type *raid_type;
+ struct target_callbacks callbacks;
struct raid_dev dev[0];
};
@@ -146,6 +147,13 @@ static void do_table_event(struct work_struct *ws)
dm_table_event(rs->ti->table);
}
+static int raid_is_congested(void *v, int bits)
+{
+ struct target_callbacks *cb = v;
+ struct raid_set *rs = container_of(cb, struct raid_set,
+ callbacks);
+ return md_raid5_congested(&rs->md, bits);
+}
/*
* Construct a RAID4/5/6 mapping:
* Args:
@@ -309,6 +317,10 @@ static int raid_ctr(struct dm_target *ti, unsigned argc, char **argv)
if (errnum)
goto err;
+
+ rs->callbacks.congested_fn = raid_is_congested;
+ dm_table_add_callbacks(ti->table, &rs->callbacks);
+
return 0;
err:
if (rs)
@@ -321,6 +333,7 @@ static void raid_dtr(struct dm_target *ti)
{
struct raid_set *rs = ti->private;
+ list_del_init(&rs->callbacks.list);
md_stop(&rs->md);
context_free(rs);
}
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 9924ea2..b856340 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -68,6 +68,8 @@ struct dm_table {
void (*event_fn)(void *);
void *event_context;
+ struct list_head target_callbacks;
+
struct dm_md_mempools *mempools;
};
@@ -202,6 +204,7 @@ int dm_table_create(struct dm_table **result, fmode_t mode,
return -ENOMEM;
INIT_LIST_HEAD(&t->devices);
+ INIT_LIST_HEAD(&t->target_callbacks);
atomic_set(&t->holders, 0);
if (!num_targets)
@@ -1174,10 +1177,18 @@ int dm_table_resume_targets(struct dm_table *t)
return 0;
}
+void dm_table_add_callbacks(struct dm_table *t,
+ struct target_callbacks *cb)
+{
+ list_add(&cb->list, &t->target_callbacks);
+}
+EXPORT_SYMBOL_GPL(dm_table_add_callbacks);
+
int dm_table_any_congested(struct dm_table *t, int bdi_bits)
{
struct dm_dev_internal *dd;
struct list_head *devices = dm_table_get_devices(t);
+ struct target_callbacks *cb;
int r = 0;
list_for_each_entry(dd, devices, list) {
@@ -1192,6 +1203,10 @@ int dm_table_any_congested(struct dm_table *t, int bdi_bits)
bdevname(dd->dm_dev.bdev, b));
}
+ list_for_each_entry(cb, &t->target_callbacks, list)
+ if (cb->congested_fn)
+ r |= cb->congested_fn(cb, bdi_bits);
+
return r;
}
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 8839573..c0746af 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3592,17 +3592,14 @@ static void raid5_unplug_device(struct request_queue *q)
unplug_slaves(mddev);
}
-static int raid5_congested(void *data, int bits)
+int md_raid5_congested(mddev_t *mddev, int bits)
{
- mddev_t *mddev = data;
raid5_conf_t *conf = mddev->private;
/* No difference between reads and writes. Just check
* how busy the stripe_cache is
*/
- if (mddev_congested(mddev, bits))
- return 1;
if (conf->inactive_blocked)
return 1;
if (conf->quiesce)
@@ -3612,6 +3609,15 @@ static int raid5_congested(void *data, int bits)
return 0;
}
+EXPORT_SYMBOL_GPL(md_raid5_congested);
+
+static int raid5_congested(void *data, int bits)
+{
+ mddev_t *mddev = data;
+
+ return mddev_congested(mddev, bits) ||
+ md_raid5_congested(mddev, bits);
+}
/* We want read requests to align with chunks where possible,
* but write requests don't need to.
@@ -5095,13 +5101,14 @@ static int run(mddev_t *mddev)
mddev->queue->backing_dev_info.ra_pages = 2 * stripe;
blk_queue_merge_bvec(mddev->queue, raid5_mergeable_bvec);
+
+ mddev->queue->backing_dev_info.congested_data = mddev;
+ mddev->queue->backing_dev_info.congested_fn = raid5_congested;
}
mddev->queue->queue_lock = &conf->device_lock;
mddev->queue->unplug_fn = raid5_unplug_device;
- mddev->queue->backing_dev_info.congested_data = mddev;
- mddev->queue->backing_dev_info.congested_fn = raid5_congested;
chunk_size = mddev->chunk_sectors << 9;
blk_queue_io_min(mddev->queue, chunk_size);
@@ -5131,7 +5138,8 @@ static int stop(mddev_t *mddev)
md_unregister_thread(mddev->thread);
mddev->thread = NULL;
- mddev->queue->backing_dev_info.congested_fn = NULL;
+ if (mddev->queue)
+ mddev->queue->backing_dev_info.congested_fn = NULL;
blk_sync_queue(mddev->queue); /* the unplug fn references 'conf'*/
free_conf(conf);
mddev->private = NULL;
diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
index bb7ab92..936caf8 100644
--- a/drivers/md/raid5.h
+++ b/drivers/md/raid5.h
@@ -497,4 +497,5 @@ static inline int algorithm_is_DDF(int layout)
{
return layout >= 8 && layout <= 10;
}
+extern int md_raid5_congested(mddev_t *mddev, int bits);
#endif
diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
index 1381cd9..2b0f538 100644
--- a/include/linux/device-mapper.h
+++ b/include/linux/device-mapper.h
@@ -187,6 +187,12 @@ struct dm_target {
char *error;
};
+/* Each target can link one of these into the table */
+struct target_callbacks {
+ struct list_head list;
+ congested_fn *congested_fn;
+};
+
int dm_register_target(struct target_type *t);
void dm_unregister_target(struct target_type *t);
@@ -263,6 +269,12 @@ int dm_table_add_target(struct dm_table *t, const char *type,
sector_t start, sector_t len, char *params);
/*
+ * Target_ctr should call this if they need to add any
+ * callback
+ */
+void dm_table_add_callbacks(struct dm_table *t,
+ struct target_callbacks *cb);
+/*
* Finally call this to make the table ready for use.
*/
int dm_table_complete(struct dm_table *t);
* [PATCH 11/24] md/raid5: add simple plugging infrastructure.
2010-06-01 9:56 [PATCH 00/24] dm-raid456 support using md/raid5.c, now with dirty-log NeilBrown
` (10 preceding siblings ...)
2010-06-01 9:56 ` [PATCH 07/24] md/dm: create dm-raid456 module using md/raid5 NeilBrown
@ 2010-06-01 9:56 ` NeilBrown
2010-06-01 9:56 ` [PATCH 13/24] dm-raid456: support unplug NeilBrown
` (11 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: NeilBrown @ 2010-06-01 9:56 UTC (permalink / raw)
To: Heinz Mauelshagen, Alasdair G Kergon; +Cc: linux-raid, dm-devel
md/raid5 uses the plugging infrastructure provided by the block layer
and 'struct request_queue'. However when we plug raid5 under dm there
is no request queue so we cannot use that.
So create a similar infrastructure that is much lighter weight and use
it for raid5.
Signed-off-by: NeilBrown <neilb@suse.de>
---
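The core of the lighter-weight plugging is a single flag: setting the
plug arms a timer only on the first call, and removing the plug reports
whether there was anything to unplug. A userspace sketch of that state
machine follows; the names are hypothetical, and plain booleans replace
the atomic test_and_set_bit()/test_and_clear_bit() and the real timer, so
it is not safe for concurrent use.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct plug_handle. */
struct plug_model {
	bool plugged;
	bool timer_armed;
};

/* Like plugger_set_plug(): only the first caller arms the timer. */
static void plug_set(struct plug_model *p)
{
	if (!p->plugged) {
		p->plugged = true;
		p->timer_armed = true;
	}
}

/* Like plugger_remove_plug(): returns 1 if a plug was actually removed. */
static int plug_remove(struct plug_model *p)
{
	if (p->plugged) {
		p->plugged = false;
		p->timer_armed = false;
		return 1;
	}
	return 0;
}
```

The return value is what lets the unplug path do its extra work, such as
activating delayed stripes, only when the array was really plugged.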
drivers/md/md.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
drivers/md/md.h | 20 ++++++++++++++++++++
drivers/md/raid5.c | 39 +++++++++++++++++++++++++--------------
drivers/md/raid5.h | 3 +++
4 files changed, 93 insertions(+), 14 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index da860f7..e44b9c6 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -384,6 +384,51 @@ void md_barrier_request(mddev_t *mddev, struct bio *bio)
}
EXPORT_SYMBOL(md_barrier_request);
+/* Support for plugging.
+ * This mirrors the plugging support in request_queue, but does not
+ * require having a whole queue
+ */
+static void plugger_work(struct work_struct *work)
+{
+ struct plug_handle *plug =
+ container_of(work, struct plug_handle, unplug_work);
+ plug->unplug_fn(plug);
+}
+static void plugger_timeout(unsigned long data)
+{
+ struct plug_handle *plug = (void *)data;
+ kblockd_schedule_work(NULL, &plug->unplug_work);
+}
+void plugger_init(struct plug_handle *plug,
+ void (*unplug_fn)(struct plug_handle *))
+{
+ plug->unplug_flag = 0;
+ plug->unplug_fn = unplug_fn;
+ init_timer(&plug->unplug_timer);
+ plug->unplug_timer.function = plugger_timeout;
+ plug->unplug_timer.data = (unsigned long)plug;
+ INIT_WORK(&plug->unplug_work, plugger_work);
+}
+EXPORT_SYMBOL_GPL(plugger_init);
+
+void plugger_set_plug(struct plug_handle *plug)
+{
+ if (!test_and_set_bit(PLUGGED_FLAG, &plug->unplug_flag))
+ mod_timer(&plug->unplug_timer, jiffies + msecs_to_jiffies(3)+1);
+}
+EXPORT_SYMBOL_GPL(plugger_set_plug);
+
+int plugger_remove_plug(struct plug_handle *plug)
+{
+ if (test_and_clear_bit(PLUGGED_FLAG, &plug->unplug_flag)) {
+ del_timer(&plug->unplug_timer);
+ return 1;
+ } else
+ return 0;
+}
+EXPORT_SYMBOL_GPL(plugger_remove_plug);
+
+
static inline mddev_t *mddev_get(mddev_t *mddev)
{
atomic_inc(&mddev->active);
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 6318175..f210482 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -29,6 +29,26 @@
typedef struct mddev_s mddev_t;
typedef struct mdk_rdev_s mdk_rdev_t;
+/* generic plugging support - like that provided with request_queue,
+ * but does not require a request_queue
+ */
+struct plug_handle {
+ void (*unplug_fn)(struct plug_handle *);
+ struct timer_list unplug_timer;
+ struct work_struct unplug_work;
+ unsigned long unplug_flag;
+};
+#define PLUGGED_FLAG 1
+void plugger_init(struct plug_handle *plug,
+ void (*unplug_fn)(struct plug_handle *));
+void plugger_set_plug(struct plug_handle *plug);
+int plugger_remove_plug(struct plug_handle *plug);
+static inline void plugger_flush(struct plug_handle *plug)
+{
+ del_timer_sync(&plug->unplug_timer);
+ cancel_work_sync(&plug->unplug_work);
+}
+
/*
* MD's 'extended' device
*/
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index c0746af..6b1802c 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -201,11 +201,11 @@ static void __release_stripe(raid5_conf_t *conf, struct stripe_head *sh)
if (test_bit(STRIPE_HANDLE, &sh->state)) {
if (test_bit(STRIPE_DELAYED, &sh->state)) {
list_add_tail(&sh->lru, &conf->delayed_list);
- blk_plug_device(conf->mddev->queue);
+ plugger_set_plug(&conf->plug);
} else if (test_bit(STRIPE_BIT_DELAY, &sh->state) &&
sh->bm_seq - conf->seq_write > 0) {
list_add_tail(&sh->lru, &conf->bitmap_list);
- blk_plug_device(conf->mddev->queue);
+ plugger_set_plug(&conf->plug);
} else {
clear_bit(STRIPE_BIT_DELAY, &sh->state);
list_add_tail(&sh->lru, &conf->handle_list);
@@ -365,7 +365,7 @@ static struct stripe_head *__find_stripe(raid5_conf_t *conf, sector_t sector,
}
static void unplug_slaves(mddev_t *mddev);
-static void raid5_unplug_device(struct request_queue *q);
+static void raid5_unplug_device(raid5_conf_t *conf);
static struct stripe_head *
get_active_stripe(raid5_conf_t *conf, sector_t sector,
@@ -395,7 +395,7 @@ get_active_stripe(raid5_conf_t *conf, sector_t sector,
< (conf->max_nr_stripes *3/4)
|| !conf->inactive_blocked),
conf->device_lock,
- raid5_unplug_device(conf->mddev->queue)
+ raid5_unplug_device(conf)
);
conf->inactive_blocked = 0;
} else
@@ -3532,7 +3532,7 @@ static void raid5_activate_delayed(raid5_conf_t *conf)
list_add_tail(&sh->lru, &conf->hold_list);
}
} else
- blk_plug_device(conf->mddev->queue);
+ plugger_set_plug(&conf->plug);
}
static void activate_bit_delay(raid5_conf_t *conf)
@@ -3573,23 +3573,33 @@ static void unplug_slaves(mddev_t *mddev)
rcu_read_unlock();
}
-static void raid5_unplug_device(struct request_queue *q)
+static void raid5_unplug_device(raid5_conf_t *conf)
{
- mddev_t *mddev = q->queuedata;
- raid5_conf_t *conf = mddev->private;
unsigned long flags;
spin_lock_irqsave(&conf->device_lock, flags);
- if (blk_remove_plug(q)) {
+ if (plugger_remove_plug(&conf->plug)) {
conf->seq_flush++;
raid5_activate_delayed(conf);
}
- md_wakeup_thread(mddev->thread);
+ md_wakeup_thread(conf->mddev->thread);
spin_unlock_irqrestore(&conf->device_lock, flags);
- unplug_slaves(mddev);
+ unplug_slaves(conf->mddev);
+}
+
+static void raid5_unplug(struct plug_handle *plug)
+{
+ raid5_conf_t *conf = container_of(plug, raid5_conf_t, plug);
+ raid5_unplug_device(conf);
+}
+
+static void raid5_unplug_queue(struct request_queue *q)
+{
+ mddev_t *mddev = q->queuedata;
+ raid5_unplug_device(mddev->private);
}
int md_raid5_congested(mddev_t *mddev, int bits)
@@ -3999,7 +4009,7 @@ static int make_request(mddev_t *mddev, struct bio * bi)
* add failed due to overlap. Flush everything
* and wait a while
*/
- raid5_unplug_device(mddev->queue);
+ raid5_unplug_device(conf);
release_stripe(sh);
schedule();
goto retry;
@@ -5089,6 +5099,7 @@ static int run(mddev_t *mddev)
mdname(mddev));
md_set_array_sectors(mddev, raid5_size(mddev, 0, 0));
+ plugger_init(&conf->plug, raid5_unplug);
if (mddev->queue) {
/* read-ahead size must cover two whole stripes, which
* is 2 * (datadisks) * chunksize where 'n' is the
@@ -5108,7 +5119,7 @@ static int run(mddev_t *mddev)
mddev->queue->queue_lock = &conf->device_lock;
- mddev->queue->unplug_fn = raid5_unplug_device;
+ mddev->queue->unplug_fn = raid5_unplug_queue;
chunk_size = mddev->chunk_sectors << 9;
blk_queue_io_min(mddev->queue, chunk_size);
@@ -5140,7 +5151,7 @@ static int stop(mddev_t *mddev)
mddev->thread = NULL;
if (mddev->queue)
mddev->queue->backing_dev_info.congested_fn = NULL;
- blk_sync_queue(mddev->queue); /* the unplug fn references 'conf'*/
+ plugger_flush(&conf->plug); /* the unplug fn references 'conf'*/
free_conf(conf);
mddev->private = NULL;
mddev->to_remove = &raid5_attrs_group;
diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
index 936caf8..edf90a8 100644
--- a/drivers/md/raid5.h
+++ b/drivers/md/raid5.h
@@ -398,6 +398,9 @@ struct raid5_private_data {
* (fresh device added).
* Cleared when a sync completes.
*/
+
+ struct plug_handle plug;
+
/* per cpu variables */
struct raid5_percpu {
struct page *spare_page; /* Used when checking P/Q in raid6 */
* [PATCH 12/24] md/plug: optionally use plugger to unplug an array during resync/recovery.
2010-06-01 9:56 [PATCH 00/24] dm-raid456 support using md/raid5.c, now with dirty-log NeilBrown
` (8 preceding siblings ...)
2010-06-01 9:56 ` [PATCH 05/24] md: split out md_rdev_init NeilBrown
@ 2010-06-01 9:56 ` NeilBrown
2010-06-01 9:56 ` [PATCH 07/24] md/dm: create dm-raid456 module using md/raid5 NeilBrown
` (13 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: NeilBrown @ 2010-06-01 9:56 UTC (permalink / raw)
To: Heinz Mauelshagen, Alasdair G Kergon; +Cc: linux-raid, dm-devel
If an array doesn't have a 'queue' then md_do_sync cannot
unplug it.
In that case it will have a 'plugger', so make that available
to the mddev, and use it to unplug the array if needed.
Signed-off-by: NeilBrown <neilb@suse.de>
---
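Not part of the patch: a rough userspace sketch of the dispatch that
md_unplug() performs, in case it helps review. The fake_* types and the
call counters are hypothetical stand-ins invented for illustration; only
the "queue first, then plugger" shape is taken from the patch.

```c
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel types. */
struct plug_handle {
	void (*unplug_fn)(struct plug_handle *);
	int unplug_count;		/* illustration only: counts calls */
};

struct fake_queue {
	int unplug_count;		/* illustration only: counts calls */
};

struct fake_mddev {
	struct fake_queue *queue;	/* NULL when there is no queue */
	struct plug_handle *plug;	/* set by the personality, may be NULL */
};

/* Stand-in for blk_unplug(): just record that it was called. */
static void fake_blk_unplug(struct fake_queue *q)
{
	q->unplug_count++;
}

/* Stand-in for a personality's unplug function, e.g. raid5_unplug(). */
static void count_unplug(struct plug_handle *plug)
{
	plug->unplug_count++;
}

/* Same shape as md_unplug(): unplug the queue if any, then the plugger. */
static void fake_md_unplug(struct fake_mddev *mddev)
{
	if (mddev->queue)
		fake_blk_unplug(mddev->queue);
	if (mddev->plug)
		mddev->plug->unplug_fn(mddev->plug);
}
```

With queue == NULL only the plugger callback runs, which is the case
md_do_sync could not handle before this patch.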
drivers/md/md.c | 15 ++++++++++++---
drivers/md/md.h | 2 ++
drivers/md/raid5.c | 1 +
3 files changed, 15 insertions(+), 3 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index e44b9c6..ea6577b 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -4665,6 +4665,7 @@ static void md_clean(mddev_t *mddev)
mddev->bitmap_info.chunksize = 0;
mddev->bitmap_info.daemon_sleep = 0;
mddev->bitmap_info.max_write_behind = 0;
+ mddev->plug = NULL;
}
void md_stop_writes(mddev_t *mddev)
@@ -6597,6 +6598,14 @@ int md_allow_write(mddev_t *mddev)
}
EXPORT_SYMBOL_GPL(md_allow_write);
+static void md_unplug(mddev_t *mddev)
+{
+ if (mddev->queue)
+ blk_unplug(mddev->queue);
+ if (mddev->plug)
+ mddev->plug->unplug_fn(mddev->plug);
+}
+
#define SYNC_MARKS 10
#define SYNC_MARK_STEP (3*HZ)
void md_do_sync(mddev_t *mddev)
@@ -6775,7 +6784,7 @@ void md_do_sync(mddev_t *mddev)
>= mddev->resync_max - mddev->curr_resync_completed
)) {
/* time to update curr_resync_completed */
- blk_unplug(mddev->queue);
+ md_unplug(mddev);
wait_event(mddev->recovery_wait,
atomic_read(&mddev->recovery_active) == 0);
mddev->curr_resync_completed =
@@ -6853,7 +6862,7 @@ void md_do_sync(mddev_t *mddev)
* about not overloading the IO subsystem. (things like an
* e2fsck being done on the RAID array should execute fast)
*/
- blk_unplug(mddev->queue);
+ md_unplug(mddev);
cond_resched();
currspeed = ((unsigned long)(io_sectors-mddev->resync_mark_cnt))/2
@@ -6872,7 +6881,7 @@ void md_do_sync(mddev_t *mddev)
* this also signals 'finished resyncing' to md_stop
*/
out:
- blk_unplug(mddev->queue);
+ md_unplug(mddev);
wait_event(mddev->recovery_wait, !atomic_read(&mddev->recovery_active));
diff --git a/drivers/md/md.h b/drivers/md/md.h
index f210482..0bf8c18 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -325,6 +325,8 @@ struct mddev_s
struct list_head all_mddevs;
struct attribute_group *to_remove;
+ struct plug_handle *plug; /* if used by personality */
+
/* Generic barrier handling.
* If there is a pending barrier request, all other
* writes are blocked while the devices are flushed.
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 6b1802c..74bc7a6 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -5100,6 +5100,7 @@ static int run(mddev_t *mddev)
md_set_array_sectors(mddev, raid5_size(mddev, 0, 0));
plugger_init(&conf->plug, raid5_unplug);
+ mddev->plug = &conf->plug;
if (mddev->queue) {
/* read-ahead size must cover two whole stripes, which
* is 2 * (datadisks) * chunksize where 'n' is the
* [PATCH 13/24] dm-raid456: support unplug
2010-06-01 9:56 [PATCH 00/24] dm-raid456 support using md/raid5.c, now with dirty-log NeilBrown
` (11 preceding siblings ...)
2010-06-01 9:56 ` [PATCH 11/24] md/raid5: add simple plugging infrastructure NeilBrown
@ 2010-06-01 9:56 ` NeilBrown
2010-06-01 9:56 ` [PATCH 09/24] raid5: Don't set read-ahead when there is no queue NeilBrown
` (10 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: NeilBrown @ 2010-06-01 9:56 UTC (permalink / raw)
To: Heinz Mauelshagen, Alasdair G Kergon; +Cc: linux-raid, dm-devel
In a similar manner to congestion checking, add per-target
unplug support for raid456 under dm.
Signed-off-by: NeilBrown <neilb@suse.de>
---
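Not part of the patch: a small userspace sketch of how the callback
recovers its raid_set from the embedded target_callbacks. The cut-down
struct raid_set, the 'unplugged' counter and unplug_one() are
hypothetical; the container_of() step and the NULL check mirror what the
patch does.

```c
#include <stddef.h>

/* Userspace version of the kernel's container_of(), for illustration. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct target_callbacks {
	void (*unplug_fn)(void *);
};

/* Hypothetical cut-down raid_set: only the fields the sketch needs. */
struct raid_set {
	int unplugged;			/* illustration only: counts calls */
	struct target_callbacks callbacks;
};

static void raid_unplug(void *v)
{
	struct target_callbacks *cb = v;
	struct raid_set *rs = container_of(cb, struct raid_set, callbacks);

	rs->unplugged++;	/* the real code calls raid5_unplug_device() */
}

/* Mirrors the loop body added to dm_table_unplug_all(). */
static void unplug_one(struct target_callbacks *cb)
{
	if (cb->unplug_fn)
		cb->unplug_fn(cb);
}
```

A target that never sets unplug_fn is simply skipped, so existing
targets that register callbacks only for congestion are unaffected.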
drivers/md/dm-raid456.c | 9 +++++++++
drivers/md/dm-table.c | 4 ++++
drivers/md/raid5.c | 11 +++++------
drivers/md/raid5.h | 1 +
include/linux/device-mapper.h | 1 +
5 files changed, 20 insertions(+), 6 deletions(-)
diff --git a/drivers/md/dm-raid456.c b/drivers/md/dm-raid456.c
index 0e3922a..46a5e29 100644
--- a/drivers/md/dm-raid456.c
+++ b/drivers/md/dm-raid456.c
@@ -154,6 +154,14 @@ static int raid_is_congested(void *v, int bits)
callbacks);
return md_raid5_congested(&rs->md, bits);
}
+static void raid_unplug(void *v)
+{
+ struct target_callbacks *cb = v;
+ struct raid_set *rs = container_of(cb, struct raid_set,
+ callbacks);
+ raid5_unplug_device(rs->md.private);
+}
+
/*
* Construct a RAID4/5/6 mapping:
* Args:
@@ -289,6 +297,7 @@ static int raid_ctr(struct dm_target *ti, unsigned argc, char **argv)
goto err;
clear_bit(In_sync, &rs->dev[rebuildA].rdev.flags);
rs->dev[rebuildA].rdev.recovery_offset = 0;
+ rs->callbacks.unplug_fn = raid_unplug;
}
if (rebuildB >= 0) {
if (rs->dev[rebuildB].dev == NULL)
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index b856340..cad4992 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -1228,6 +1228,7 @@ void dm_table_unplug_all(struct dm_table *t)
{
struct dm_dev_internal *dd;
struct list_head *devices = dm_table_get_devices(t);
+ struct target_callbacks *cb;
list_for_each_entry(dd, devices, list) {
struct request_queue *q = bdev_get_queue(dd->dm_dev.bdev);
@@ -1240,6 +1241,9 @@ void dm_table_unplug_all(struct dm_table *t)
dm_device_name(t->md),
bdevname(dd->dm_dev.bdev, b));
}
+ list_for_each_entry(cb, &t->target_callbacks, list)
+ if (cb->unplug_fn)
+ cb->unplug_fn(cb);
}
struct mapped_device *dm_table_get_md(struct dm_table *t)
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 74bc7a6..efcea17 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -365,7 +365,6 @@ static struct stripe_head *__find_stripe(raid5_conf_t *conf, sector_t sector,
}
static void unplug_slaves(mddev_t *mddev);
-static void raid5_unplug_device(raid5_conf_t *conf);
static struct stripe_head *
get_active_stripe(raid5_conf_t *conf, sector_t sector,
@@ -3573,7 +3572,7 @@ static void unplug_slaves(mddev_t *mddev)
rcu_read_unlock();
}
-static void raid5_unplug_device(raid5_conf_t *conf)
+void raid5_unplug_device(raid5_conf_t *conf)
{
unsigned long flags;
@@ -3589,6 +3588,7 @@ static void raid5_unplug_device(raid5_conf_t *conf)
unplug_slaves(conf->mddev);
}
+EXPORT_SYMBOL_GPL(raid5_unplug_device);
static void raid5_unplug(struct plug_handle *plug)
{
@@ -5116,11 +5116,10 @@ static int run(mddev_t *mddev)
mddev->queue->backing_dev_info.congested_data = mddev;
mddev->queue->backing_dev_info.congested_fn = raid5_congested;
- }
-
- mddev->queue->queue_lock = &conf->device_lock;
- mddev->queue->unplug_fn = raid5_unplug_queue;
+ mddev->queue->queue_lock = &conf->device_lock;
+ mddev->queue->unplug_fn = raid5_unplug_queue;
+ }
chunk_size = mddev->chunk_sectors << 9;
blk_queue_io_min(mddev->queue, chunk_size);
diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
index edf90a8..fa3938a 100644
--- a/drivers/md/raid5.h
+++ b/drivers/md/raid5.h
@@ -501,4 +501,5 @@ static inline int algorithm_is_DDF(int layout)
return layout >= 8 && layout <= 10;
}
extern int md_raid5_congested(mddev_t *mddev, int bits);
+extern void raid5_unplug_device(raid5_conf_t *conf);
#endif
diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
index 2b0f538..5b8ac19 100644
--- a/include/linux/device-mapper.h
+++ b/include/linux/device-mapper.h
@@ -191,6 +191,7 @@ struct dm_target {
struct target_callbacks {
struct list_head list;
congested_fn *congested_fn;
+ void (*unplug_fn)(void *);
};
int dm_register_target(struct target_type *t);
* [PATCH 14/24] dm-raid456: add support for setting IO hints.
2010-06-01 9:56 [PATCH 00/24] dm-raid456 support using md/raid5.c, now with dirty-log NeilBrown
` (4 preceding siblings ...)
2010-06-01 9:56 ` [PATCH 06/24] md: export various start/stop interfaces NeilBrown
@ 2010-06-01 9:56 ` NeilBrown
2010-06-01 9:56 ` [PATCH 10/24] dm-raid456: add congestion checking NeilBrown
` (17 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: NeilBrown @ 2010-06-01 9:56 UTC (permalink / raw)
To: Heinz Mauelshagen, Alasdair G Kergon; +Cc: linux-raid, dm-devel
Add iterate_devices and io_hints methods so that dm can see the
component devices and set io_min and io_opt for the target.
In raid5.c, only set the queue limits when there is a queue.
Signed-off-by: NeilBrown <neilb@suse.de>
---
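Not part of the patch: a worked example of the arithmetic in
raid_io_hints(), as a userspace sketch. The struct io_hints and
raid_hints() names are hypothetical; the formulas are the ones in the
patch (io_min is one chunk, io_opt is one chunk per data disk).

```c
/* Hypothetical container for the two hints, both in bytes. */
struct io_hints {
	unsigned io_min;
	unsigned io_opt;
};

/* Same arithmetic as raid_io_hints(), without the queue_limits plumbing. */
static struct io_hints raid_hints(unsigned chunk_sectors,
				  unsigned raid_disks,
				  unsigned max_degraded)
{
	struct io_hints h;
	unsigned chunk_size = chunk_sectors << 9;	/* sectors to bytes */

	h.io_min = chunk_size;
	h.io_opt = chunk_size * (raid_disks - max_degraded);
	return h;
}
```

For example, a 5-device raid5 (max_degraded 1) with 128-sector chunks
gives io_min = 65536 and io_opt = 262144 bytes; a 6-device raid6
(max_degraded 2) with the same chunk size gives the same io_opt, as both
have four data disks.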
drivers/md/dm-raid456.c | 33 +++++++++++++++++++++++++++++++++
drivers/md/raid5.c | 19 ++++++++++---------
2 files changed, 43 insertions(+), 9 deletions(-)
diff --git a/drivers/md/dm-raid456.c b/drivers/md/dm-raid456.c
index 46a5e29..0b72fa4 100644
--- a/drivers/md/dm-raid456.c
+++ b/drivers/md/dm-raid456.c
@@ -435,6 +435,37 @@ static int raid_status(struct dm_target *ti, status_type_t type,
return 0;
}
+static int raid_iterate_devices(struct dm_target *ti,
+ iterate_devices_callout_fn fn,
+ void *data)
+{
+ struct raid_set *rs = ti->private;
+ int ret = 0;
+ unsigned i = 0;
+
+ for (i = 0; !ret && i < rs->md.raid_disks; i++)
+ if (rs->dev[i].dev)
+ ret = fn(ti,
+ rs->dev[i].dev,
+ rs->dev[i].rdev.data_offset,
+ rs->md.dev_sectors,
+ data);
+
+ return ret;
+}
+
+static void raid_io_hints(struct dm_target *ti,
+ struct queue_limits *limits)
+{
+ struct raid_set *rs = ti->private;
+ unsigned chunk_size = rs->md.chunk_sectors << 9;
+ raid5_conf_t *conf = rs->md.private;
+
+ blk_limits_io_min(limits, chunk_size);
+ blk_limits_io_opt(limits, chunk_size *
+ (conf->raid_disks - conf->max_degraded));
+}
+
static struct target_type raid_target = {
.name = "raid45",
.version = {1, 0, 0},
@@ -443,6 +474,8 @@ static struct target_type raid_target = {
.dtr = raid_dtr,
.map = raid_map,
.status = raid_status,
+ .iterate_devices = raid_iterate_devices,
+ .io_hints = raid_io_hints,
};
static int __init dm_raid_init(void)
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index efcea17..be5cab8 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -4904,7 +4904,7 @@ static int only_parity(int raid_disk, int algo, int raid_disks, int max_degraded
static int run(mddev_t *mddev)
{
raid5_conf_t *conf;
- int working_disks = 0, chunk_size;
+ int working_disks = 0;
int dirty_parity_disks = 0;
mdk_rdev_t *rdev;
sector_t reshape_offset = 0;
@@ -5102,6 +5102,7 @@ static int run(mddev_t *mddev)
plugger_init(&conf->plug, raid5_unplug);
mddev->plug = &conf->plug;
if (mddev->queue) {
+ int chunk_size;
/* read-ahead size must cover two whole stripes, which
* is 2 * (datadisks) * chunksize where 'n' is the
* number of raid devices
@@ -5119,16 +5120,16 @@ static int run(mddev_t *mddev)
mddev->queue->queue_lock = &conf->device_lock;
mddev->queue->unplug_fn = raid5_unplug_queue;
- }
- chunk_size = mddev->chunk_sectors << 9;
- blk_queue_io_min(mddev->queue, chunk_size);
- blk_queue_io_opt(mddev->queue, chunk_size *
- (conf->raid_disks - conf->max_degraded));
+ chunk_size = mddev->chunk_sectors << 9;
+ blk_queue_io_min(mddev->queue, chunk_size);
+ blk_queue_io_opt(mddev->queue, chunk_size *
+ (conf->raid_disks - conf->max_degraded));
- list_for_each_entry(rdev, &mddev->disks, same_set)
- disk_stack_limits(mddev->gendisk, rdev->bdev,
- rdev->data_offset << 9);
+ list_for_each_entry(rdev, &mddev->disks, same_set)
+ disk_stack_limits(mddev->gendisk, rdev->bdev,
+ rdev->data_offset << 9);
+ }
return 0;
abort:
* [PATCH 15/24] dm-raid456: add suspend/resume method
2010-06-01 9:56 [PATCH 00/24] dm-raid456 support using md/raid5.c, now with dirty-log NeilBrown
` (16 preceding siblings ...)
2010-06-01 9:56 ` [PATCH 22/24] md/bitmap: prepare for storing write-intent-bitmap via dm-dirty-log NeilBrown
@ 2010-06-01 9:56 ` NeilBrown
2010-06-01 9:56 ` [PATCH 24/24] dm-raid456: switch to use dm_dirty_log for tracking dirty regions NeilBrown
` (5 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: NeilBrown @ 2010-06-01 9:56 UTC (permalink / raw)
To: Heinz Mauelshagen, Alasdair G Kergon; +Cc: linux-raid, dm-devel
These just call into the md methods; mddev_suspend and
mddev_resume are exported for this purpose.
Signed-off-by: NeilBrown <neilb@suse.de>
---
drivers/md/dm-raid456.c | 22 ++++++++++++++++++++++
drivers/md/md.c | 6 ++++--
drivers/md/md.h | 3 +++
3 files changed, 29 insertions(+), 2 deletions(-)
diff --git a/drivers/md/dm-raid456.c b/drivers/md/dm-raid456.c
index 0b72fa4..044faae 100644
--- a/drivers/md/dm-raid456.c
+++ b/drivers/md/dm-raid456.c
@@ -466,6 +466,25 @@ static void raid_io_hints(struct dm_target *ti,
(conf->raid_disks - conf->max_degraded));
}
+static void raid_presuspend(struct dm_target *ti)
+{
+ struct raid_set *rs = ti->private;
+ md_stop_writes(&rs->md);
+}
+
+static void raid_postsuspend(struct dm_target *ti)
+{
+ struct raid_set *rs = ti->private;
+ mddev_suspend(&rs->md);
+}
+
+static void raid_resume(struct dm_target *ti)
+{
+ struct raid_set *rs = ti->private;
+
+ mddev_resume(&rs->md);
+}
+
static struct target_type raid_target = {
.name = "raid45",
.version = {1, 0, 0},
@@ -476,6 +495,9 @@ static struct target_type raid_target = {
.status = raid_status,
.iterate_devices = raid_iterate_devices,
.io_hints = raid_io_hints,
+ .presuspend = raid_presuspend,
+ .postsuspend = raid_postsuspend,
+ .resume = raid_resume,
};
static int __init dm_raid_init(void)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index ea6577b..21f3c35 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -261,7 +261,7 @@ static int md_make_request(struct request_queue *q, struct bio *bio)
* Once ->stop is called and completes, the module will be completely
* unused.
*/
-static void mddev_suspend(mddev_t *mddev)
+void mddev_suspend(mddev_t *mddev)
{
BUG_ON(mddev->suspended);
mddev->suspended = 1;
@@ -269,13 +269,15 @@ static void mddev_suspend(mddev_t *mddev)
wait_event(mddev->sb_wait, atomic_read(&mddev->active_io) == 0);
mddev->pers->quiesce(mddev, 1);
}
+EXPORT_SYMBOL_GPL(mddev_suspend);
-static void mddev_resume(mddev_t *mddev)
+void mddev_resume(mddev_t *mddev)
{
mddev->suspended = 0;
wake_up(&mddev->sb_wait);
mddev->pers->quiesce(mddev, 0);
}
+EXPORT_SYMBOL_GPL(mddev_resume);
int mddev_congested(mddev_t *mddev, int bits)
{
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 0bf8c18..6f2c568 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -512,4 +512,7 @@ extern int md_run(mddev_t *mddev);
extern void md_stop(mddev_t *mddev);
extern void md_stop_writes(mddev_t *mddev);
extern void md_rdev_init(mdk_rdev_t *rdev);
+
+extern void mddev_suspend(mddev_t *mddev);
+extern void mddev_resume(mddev_t *mddev);
#endif /* _MD_MD_H */
* [PATCH 16/24] dm-raid456: add message handler.
2010-06-01 9:56 [PATCH 00/24] dm-raid456 support using md/raid5.c, now with dirty-log NeilBrown
` (21 preceding siblings ...)
2010-06-01 9:56 ` [PATCH 23/24] md/bitmap: separate out loading a bitmap from initialising the structures NeilBrown
@ 2010-06-01 9:56 ` NeilBrown
2010-06-01 9:56 ` [PATCH 21/24] dm-dirty-log: allow log size to be different from target size NeilBrown
23 siblings, 0 replies; 27+ messages in thread
From: NeilBrown @ 2010-06-01 9:56 UTC (permalink / raw)
To: Heinz Mauelshagen, Alasdair G Kergon; +Cc: linux-raid, dm-devel
Support messages to:
- change the size of the stripe cache
- change the speed limiter on resync.
Signed-off-by: NeilBrown <neilb@suse.de>
---
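Not part of the patch: a userspace sketch of the message parsing, to
show which inputs are accepted. struct fake_raid, parse_ulong() and
fake_raid_message() are hypothetical names; parse_ulong() is only a
rough approximation of strict_strtoul(), and the sketch stores the
stripe cache size directly instead of calling raid5_set_cache_size().

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the two settings the messages touch. */
struct fake_raid {
	unsigned long stripe_cache;
	unsigned long sync_speed_min;
};

/* Rough stand-in for strict_strtoul(): rejects empty input and trailing junk. */
static int parse_ulong(const char *s, unsigned long *res)
{
	char *end;

	if (*s == '\0' || *s == '-')
		return -EINVAL;
	errno = 0;
	*res = strtoul(s, &end, 10);
	if (errno || *end != '\0')
		return -EINVAL;
	return 0;
}

/* Accepts exactly "stripecache N" or "minspeed N"; anything else is -EINVAL. */
static int fake_raid_message(struct fake_raid *rs, unsigned argc, char **argv)
{
	unsigned long val;

	if (argc != 2 || parse_ulong(argv[1], &val))
		return -EINVAL;
	if (strcmp(argv[0], "stripecache") == 0) {
		rs->stripe_cache = val;
		return 0;
	}
	if (strcmp(argv[0], "minspeed") == 0) {
		rs->sync_speed_min = val;
		return 0;
	}
	return -EINVAL;
}
```

Unknown keywords, a wrong argument count and non-numeric values all
return -EINVAL, as in raid_message().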
drivers/md/dm-raid456.c | 26 ++++++++++++++++++++++++++
drivers/md/raid5.c | 3 ++-
drivers/md/raid5.h | 1 +
3 files changed, 29 insertions(+), 1 deletions(-)
diff --git a/drivers/md/dm-raid456.c b/drivers/md/dm-raid456.c
index 044faae..3fda954 100644
--- a/drivers/md/dm-raid456.c
+++ b/drivers/md/dm-raid456.c
@@ -485,6 +485,31 @@ static void raid_resume(struct dm_target *ti)
mddev_resume(&rs->md);
}
+/* Parse and handle a message from userspace
+ * Messages are:
+ * stripecache N (pages per device)
+ * minspeed N (kibibytes per second)
+ */
+static int raid_message(struct dm_target *ti, unsigned argc, char **argv)
+{
+ struct raid_set *rs = ti->private;
+
+ if (argc == 2 && strcmp(argv[0], "stripecache") == 0) {
+ unsigned long size;
+ if (strict_strtoul(argv[1], 10, &size))
+ return -EINVAL;
+ return raid5_set_cache_size(&rs->md, size);
+ }
+ if (argc == 2 && strcmp(argv[0], "minspeed") == 0) {
+ unsigned long speed;
+ if (strict_strtoul(argv[1], 10, &speed))
+ return -EINVAL;
+ rs->md.sync_speed_min = speed;
+ return 0;
+ }
+ return -EINVAL;
+}
+
static struct target_type raid_target = {
.name = "raid45",
.version = {1, 0, 0},
@@ -498,6 +523,7 @@ static struct target_type raid_target = {
.presuspend = raid_presuspend,
.postsuspend = raid_postsuspend,
.resume = raid_resume,
+ .message = raid_message,
};
static int __init dm_raid_init(void)
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index be5cab8..8ac122d 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -4500,7 +4500,7 @@ raid5_show_stripe_cache_size(mddev_t *mddev, char *page)
return 0;
}
-static int
+int
raid5_set_cache_size(mddev_t *mddev, int size)
{
raid5_conf_t *conf = mddev->private;
@@ -4524,6 +4524,7 @@ raid5_set_cache_size(mddev_t *mddev, int size)
}
return 0;
}
+EXPORT_SYMBOL(raid5_set_cache_size);
static ssize_t
raid5_store_stripe_cache_size(mddev_t *mddev, const char *page, size_t len)
diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
index fa3938a..292a9c6 100644
--- a/drivers/md/raid5.h
+++ b/drivers/md/raid5.h
@@ -502,4 +502,5 @@ static inline int algorithm_is_DDF(int layout)
}
extern int md_raid5_congested(mddev_t *mddev, int bits);
extern void raid5_unplug_device(raid5_conf_t *conf);
+extern int raid5_set_cache_size(mddev_t *mddev, int size);
#endif
* [PATCH 19/24] md/bitmap: clean up plugging calls.
2010-06-01 9:56 [PATCH 00/24] dm-raid456 support using md/raid5.c, now with dirty-log NeilBrown
` (13 preceding siblings ...)
2010-06-01 9:56 ` [PATCH 09/24] raid5: Don't set read-ahead when there is no queue NeilBrown
@ 2010-06-01 9:56 ` NeilBrown
2010-06-01 9:56 ` [PATCH 20/24] md/bitmap: optimise scanning of empty bitmaps NeilBrown
` (8 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: NeilBrown @ 2010-06-01 9:56 UTC (permalink / raw)
To: Heinz Mauelshagen, Alasdair G Kergon; +Cc: linux-raid, dm-devel
1/ use md_unplug in bitmap.c as we will soon be using bitmaps under
arrays with no queue attached.
2/ Don't bother plugging the queue when we set a bit in the bitmap.
The reason for this was to encourage as many bits as possible to
get set before we unplug and write stuff out.
However every personality already plugs the queue after
bitmap_startwrite either directly (raid1/raid10) or by setting
STRIPE_BIT_DELAY which causes the queue to be plugged later
(raid5).
Signed-off-by: NeilBrown <neilb@suse.de>
---
drivers/md/bitmap.c | 3 +--
drivers/md/md.c | 2 +-
drivers/md/md.h | 1 +
3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
index 8af4d65..3f04699 100644
--- a/drivers/md/bitmap.c
+++ b/drivers/md/bitmap.c
@@ -1296,7 +1296,7 @@ int bitmap_startwrite(struct bitmap *bitmap, sector_t offset, unsigned long sect
prepare_to_wait(&bitmap->overflow_wait, &__wait,
TASK_UNINTERRUPTIBLE);
spin_unlock_irq(&bitmap->lock);
- blk_unplug(bitmap->mddev->queue);
+ md_unplug(bitmap->mddev);
schedule();
finish_wait(&bitmap->overflow_wait, &__wait);
continue;
@@ -1306,7 +1306,6 @@ int bitmap_startwrite(struct bitmap *bitmap, sector_t offset, unsigned long sect
case 0:
bitmap_file_set_bit(bitmap, offset);
bitmap_count_page(bitmap, offset, 1);
- blk_plug_device_unlocked(bitmap->mddev->queue);
/* fall through */
case 1:
*bmc = 2;
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 21f3c35..f24efd2 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6600,7 +6600,7 @@ int md_allow_write(mddev_t *mddev)
}
EXPORT_SYMBOL_GPL(md_allow_write);
-static void md_unplug(mddev_t *mddev)
+void md_unplug(mddev_t *mddev)
{
if (mddev->queue)
blk_unplug(mddev->queue);
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 6f2c568..a97e4b0 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -506,6 +506,7 @@ extern int md_integrity_register(mddev_t *mddev);
extern void md_integrity_add_rdev(mdk_rdev_t *rdev, mddev_t *mddev);
extern int strict_strtoul_scaled(const char *cp, unsigned long *res, int scale);
extern void restore_bitmap_write_access(struct file *file);
+extern void md_unplug(mddev_t *mddev);
extern void mddev_init(mddev_t *mddev);
extern int md_run(mddev_t *mddev);
* [PATCH 17/24] md/bitmap: white space clean up and similar.
2010-06-01 9:56 [PATCH 00/24] dm-raid456 support using md/raid5.c, now with dirty-log NeilBrown
` (18 preceding siblings ...)
2010-06-01 9:56 ` [PATCH 24/24] dm-raid456: switch to use dm_dirty_log for tracking dirty regions NeilBrown
@ 2010-06-01 9:56 ` NeilBrown
2010-06-01 9:56 ` [PATCH 18/24] md/bitmap: reduce dependence on sysfs NeilBrown
` (3 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: NeilBrown @ 2010-06-01 9:56 UTC (permalink / raw)
To: Heinz Mauelshagen, Alasdair G Kergon; +Cc: linux-raid, dm-devel
Fixed some whitespace problems.
Fixed some checkpatch.pl complaints.
Replaced kmalloc followed by memset(0) with kzalloc.
Fixed an unlikely memory leak on an error path.
Reformatted a number of 'if/else' sets, sometimes
replacing goto with an else clause.
Removed some old comments and commented-out code.
Signed-off-by: NeilBrown <neilb@suse.de>
---
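Not part of the patch: a userspace sketch of the two non-cosmetic
changes, namely allocating zeroed memory in one call and freeing a
buffer on the error path only if this function allocated it (the
did_alloc flag in read_sb_page()). read_into(), BUF_SIZE, read_ok and
the live_allocs counter are hypothetical; calloc() and free() stand in
for kzalloc()/alloc_page() and put_page().

```c
#include <stdlib.h>

#define BUF_SIZE 4096

static int live_allocs;	/* illustration only: tracks outstanding buffers */

/*
 * Hypothetical analogue of read_sb_page(): use the caller's buffer if one
 * is given, otherwise allocate a zeroed one.  On failure, release only a
 * buffer allocated here; the caller still owns one it passed in.
 */
static unsigned char *read_into(unsigned char *buf, int read_ok)
{
	int did_alloc = 0;

	if (!buf) {
		buf = calloc(1, BUF_SIZE);	/* zeroed in one call */
		if (!buf)
			return NULL;
		did_alloc = 1;
		live_allocs++;
	}
	if (read_ok)
		return buf;
	if (did_alloc) {
		free(buf);			/* this was the leaked case */
		live_allocs--;
	}
	return NULL;
}
```

Without the did_alloc check the failure path would either leak a buffer
allocated here or free one that the caller still owns.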
drivers/md/bitmap.c | 301 +++++++++++++++++++++++----------------------------
1 files changed, 135 insertions(+), 166 deletions(-)
diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
index 4518994..67fb32d 100644
--- a/drivers/md/bitmap.c
+++ b/drivers/md/bitmap.c
@@ -13,7 +13,6 @@
* Still to do:
*
* flush after percent set rather than just time based. (maybe both).
- * wait if count gets too high, wake when it drops to half.
*/
#include <linux/blkdev.h>
@@ -51,9 +50,6 @@
#define INJECT_FATAL_FAULT_3 0 /* undef */
#endif
-//#define DPRINTK PRINTK /* set this NULL to avoid verbose debug output */
-#define DPRINTK(x...) do { } while(0)
-
#ifndef PRINTK
# if DEBUG > 0
# define PRINTK(x...) printk(KERN_DEBUG x)
@@ -62,12 +58,11 @@
# endif
#endif
-static inline char * bmname(struct bitmap *bitmap)
+static inline char *bmname(struct bitmap *bitmap)
{
return bitmap->mddev ? mdname(bitmap->mddev) : "mdX";
}
-
/*
* just a placeholder - calls kmalloc for bitmap pages
*/
@@ -78,7 +73,7 @@ static unsigned char *bitmap_alloc_page(struct bitmap *bitmap)
#ifdef INJECT_FAULTS_1
page = NULL;
#else
- page = kmalloc(PAGE_SIZE, GFP_NOIO);
+ page = kzalloc(PAGE_SIZE, GFP_NOIO);
#endif
if (!page)
printk("%s: bitmap_alloc_page FAILED\n", bmname(bitmap));
@@ -107,7 +102,8 @@ static void bitmap_free_page(struct bitmap *bitmap, unsigned char *page)
* if we find our page, we increment the page's refcount so that it stays
* allocated while we're using it
*/
-static int bitmap_checkpage(struct bitmap *bitmap, unsigned long page, int create)
+static int bitmap_checkpage(struct bitmap *bitmap,
+ unsigned long page, int create)
__releases(bitmap->lock)
__acquires(bitmap->lock)
{
@@ -121,7 +117,6 @@ __acquires(bitmap->lock)
return -EINVAL;
}
-
if (bitmap->bp[page].hijacked) /* it's hijacked, don't try to alloc */
return 0;
@@ -131,43 +126,34 @@ __acquires(bitmap->lock)
if (!create)
return -ENOENT;
- spin_unlock_irq(&bitmap->lock);
-
/* this page has not been allocated yet */
- if ((mappage = bitmap_alloc_page(bitmap)) == NULL) {
+ spin_unlock_irq(&bitmap->lock);
+ mappage = bitmap_alloc_page(bitmap);
+ spin_lock_irq(&bitmap->lock);
+
+ if (mappage == NULL) {
PRINTK("%s: bitmap map page allocation failed, hijacking\n",
bmname(bitmap));
/* failed - set the hijacked flag so that we can use the
* pointer as a counter */
- spin_lock_irq(&bitmap->lock);
if (!bitmap->bp[page].map)
bitmap->bp[page].hijacked = 1;
- goto out;
- }
-
- /* got a page */
-
- spin_lock_irq(&bitmap->lock);
-
- /* recheck the page */
-
- if (bitmap->bp[page].map || bitmap->bp[page].hijacked) {
+ } else if (bitmap->bp[page].map ||
+ bitmap->bp[page].hijacked) {
/* somebody beat us to getting the page */
bitmap_free_page(bitmap, mappage);
return 0;
- }
+ } else {
- /* no page was in place and we have one, so install it */
+ /* no page was in place and we have one, so install it */
- memset(mappage, 0, PAGE_SIZE);
- bitmap->bp[page].map = mappage;
- bitmap->missing_pages--;
-out:
+ bitmap->bp[page].map = mappage;
+ bitmap->missing_pages--;
+ }
return 0;
}
-
/* if page is completely empty, put it back on the free list, or dealloc it */
/* if page was hijacked, unmark the flag so it might get alloced next time */
/* Note: lock should be held when calling this */
@@ -183,26 +169,15 @@ static void bitmap_checkfree(struct bitmap *bitmap, unsigned long page)
if (bitmap->bp[page].hijacked) { /* page was hijacked, undo this now */
bitmap->bp[page].hijacked = 0;
bitmap->bp[page].map = NULL;
- return;
+ } else {
+ /* normal case, free the page */
+ ptr = bitmap->bp[page].map;
+ bitmap->bp[page].map = NULL;
+ bitmap->missing_pages++;
+ bitmap_free_page(bitmap, ptr);
}
-
- /* normal case, free the page */
-
-#if 0
-/* actually ... let's not. We will probably need the page again exactly when
- * memory is tight and we are flusing to disk
- */
- return;
-#else
- ptr = bitmap->bp[page].map;
- bitmap->bp[page].map = NULL;
- bitmap->missing_pages++;
- bitmap_free_page(bitmap, ptr);
- return;
-#endif
}
-
/*
* bitmap file handling - read and write the bitmap file and its superblock
*/
@@ -220,11 +195,14 @@ static struct page *read_sb_page(mddev_t *mddev, loff_t offset,
mdk_rdev_t *rdev;
sector_t target;
+ int did_alloc = 0;
- if (!page)
+ if (!page) {
page = alloc_page(GFP_KERNEL);
- if (!page)
- return ERR_PTR(-ENOMEM);
+ if (!page)
+ return ERR_PTR(-ENOMEM);
+ did_alloc = 1;
+ }
list_for_each_entry(rdev, &mddev->disks, same_set) {
if (! test_bit(In_sync, &rdev->flags)
@@ -242,6 +220,8 @@ static struct page *read_sb_page(mddev_t *mddev, loff_t offset,
return page;
}
}
+ if (did_alloc)
+ put_page(page);
return ERR_PTR(-EIO);
}
@@ -286,49 +266,51 @@ static int write_sb_page(struct bitmap *bitmap, struct page *page, int wait)
mddev_t *mddev = bitmap->mddev;
while ((rdev = next_active_rdev(rdev, mddev)) != NULL) {
- int size = PAGE_SIZE;
- loff_t offset = mddev->bitmap_info.offset;
- if (page->index == bitmap->file_pages-1)
- size = roundup(bitmap->last_page_size,
- bdev_logical_block_size(rdev->bdev));
- /* Just make sure we aren't corrupting data or
- * metadata
- */
- if (mddev->external) {
- /* Bitmap could be anywhere. */
- if (rdev->sb_start + offset + (page->index *(PAGE_SIZE/512)) >
- rdev->data_offset &&
- rdev->sb_start + offset <
- rdev->data_offset + mddev->dev_sectors +
- (PAGE_SIZE/512))
- goto bad_alignment;
- } else if (offset < 0) {
- /* DATA BITMAP METADATA */
- if (offset
- + (long)(page->index * (PAGE_SIZE/512))
- + size/512 > 0)
- /* bitmap runs in to metadata */
- goto bad_alignment;
- if (rdev->data_offset + mddev->dev_sectors
- > rdev->sb_start + offset)
- /* data runs in to bitmap */
- goto bad_alignment;
- } else if (rdev->sb_start < rdev->data_offset) {
- /* METADATA BITMAP DATA */
- if (rdev->sb_start
- + offset
- + page->index*(PAGE_SIZE/512) + size/512
- > rdev->data_offset)
- /* bitmap runs in to data */
- goto bad_alignment;
- } else {
- /* DATA METADATA BITMAP - no problems */
- }
- md_super_write(mddev, rdev,
- rdev->sb_start + offset
- + page->index * (PAGE_SIZE/512),
- size,
- page);
+ int size = PAGE_SIZE;
+ loff_t offset = mddev->bitmap_info.offset;
+ if (page->index == bitmap->file_pages-1)
+ size = roundup(bitmap->last_page_size,
+ bdev_logical_block_size(rdev->bdev));
+ /* Just make sure we aren't corrupting data or
+ * metadata
+ */
+ if (mddev->external) {
+ /* Bitmap could be anywhere. */
+ if (rdev->sb_start + offset + (page->index
+ * (PAGE_SIZE/512))
+ > rdev->data_offset
+ &&
+ rdev->sb_start + offset
+ < (rdev->data_offset + mddev->dev_sectors
+ + (PAGE_SIZE/512)))
+ goto bad_alignment;
+ } else if (offset < 0) {
+ /* DATA BITMAP METADATA */
+ if (offset
+ + (long)(page->index * (PAGE_SIZE/512))
+ + size/512 > 0)
+ /* bitmap runs in to metadata */
+ goto bad_alignment;
+ if (rdev->data_offset + mddev->dev_sectors
+ > rdev->sb_start + offset)
+ /* data runs in to bitmap */
+ goto bad_alignment;
+ } else if (rdev->sb_start < rdev->data_offset) {
+ /* METADATA BITMAP DATA */
+ if (rdev->sb_start
+ + offset
+ + page->index*(PAGE_SIZE/512) + size/512
+ > rdev->data_offset)
+ /* bitmap runs in to data */
+ goto bad_alignment;
+ } else {
+ /* DATA METADATA BITMAP - no problems */
+ }
+ md_super_write(mddev, rdev,
+ rdev->sb_start + offset
+ + page->index * (PAGE_SIZE/512),
+ size,
+ page);
}
if (wait)
@@ -364,10 +346,9 @@ static void write_page(struct bitmap *bitmap, struct page *page, int wait)
bh = bh->b_this_page;
}
- if (wait) {
+ if (wait)
wait_event(bitmap->write_wait,
atomic_read(&bitmap->pending_writes)==0);
- }
}
if (bitmap->flags & BITMAP_WRITE_ERROR)
bitmap_file_kick(bitmap);
@@ -424,7 +405,7 @@ static struct page *read_page(struct file *file, unsigned long index,
struct buffer_head *bh;
sector_t block;
- PRINTK("read bitmap file (%dB @ %Lu)\n", (int)PAGE_SIZE,
+ PRINTK("read bitmap file (%dB @ %llu)\n", (int)PAGE_SIZE,
(unsigned long long)index << PAGE_SHIFT);
page = alloc_page(GFP_KERNEL);
@@ -478,7 +459,7 @@ static struct page *read_page(struct file *file, unsigned long index,
}
out:
if (IS_ERR(page))
- printk(KERN_ALERT "md: bitmap read error: (%dB @ %Lu): %ld\n",
+ printk(KERN_ALERT "md: bitmap read error: (%dB @ %llu): %ld\n",
(int)PAGE_SIZE,
(unsigned long long)index << PAGE_SHIFT,
PTR_ERR(page));
@@ -664,11 +645,14 @@ static int bitmap_mask_state(struct bitmap *bitmap, enum bitmap_state bits,
sb = kmap_atomic(bitmap->sb_page, KM_USER0);
old = le32_to_cpu(sb->state) & bits;
switch (op) {
- case MASK_SET: sb->state |= cpu_to_le32(bits);
- break;
- case MASK_UNSET: sb->state &= cpu_to_le32(~bits);
- break;
- default: BUG();
+ case MASK_SET:
+ sb->state |= cpu_to_le32(bits);
+ break;
+ case MASK_UNSET:
+ sb->state &= cpu_to_le32(~bits);
+ break;
+ default:
+ BUG();
}
kunmap_atomic(sb, KM_USER0);
return old;
@@ -710,12 +694,12 @@ static inline unsigned long file_page_offset(struct bitmap *bitmap, unsigned lon
static inline struct page *filemap_get_page(struct bitmap *bitmap,
unsigned long chunk)
{
- if (file_page_index(bitmap, chunk) >= bitmap->file_pages) return NULL;
+ if (file_page_index(bitmap, chunk) >= bitmap->file_pages)
+ return NULL;
return bitmap->filemap[file_page_index(bitmap, chunk)
- file_page_index(bitmap, 0)];
}
-
static void bitmap_file_unmap(struct bitmap *bitmap)
{
struct page **map, *sb_page;
@@ -766,7 +750,6 @@ static void bitmap_file_put(struct bitmap *bitmap)
}
}
-
/*
* bitmap_file_kick - if an error occurs while manipulating the bitmap file
* then it is no longer reliable, so we stop using it and we mark the file
@@ -785,7 +768,6 @@ static void bitmap_file_kick(struct bitmap *bitmap)
ptr = d_path(&bitmap->file->f_path, path,
PAGE_SIZE);
-
printk(KERN_ALERT
"%s: kicking failed bitmap file %s from array!\n",
bmname(bitmap), IS_ERR(ptr) ? "" : ptr);
@@ -803,9 +785,9 @@ static void bitmap_file_kick(struct bitmap *bitmap)
}
enum bitmap_page_attr {
- BITMAP_PAGE_DIRTY = 0, // there are set bits that need to be synced
- BITMAP_PAGE_CLEAN = 1, // there are bits that might need to be cleared
- BITMAP_PAGE_NEEDWRITE=2, // there are cleared bits that need to be synced
+ BITMAP_PAGE_DIRTY = 0, /* there are set bits that need to be synced */
+ BITMAP_PAGE_CLEAN = 1, /* there are bits that might need to be cleared */
+ BITMAP_PAGE_NEEDWRITE = 2, /* there are cleared bits that need to be synced */
};
static inline void set_page_attr(struct bitmap *bitmap, struct page *page,
@@ -840,15 +822,15 @@ static void bitmap_file_set_bit(struct bitmap *bitmap, sector_t block)
void *kaddr;
unsigned long chunk = block >> CHUNK_BLOCK_SHIFT(bitmap);
- if (!bitmap->filemap) {
+ if (!bitmap->filemap)
return;
- }
page = filemap_get_page(bitmap, chunk);
- if (!page) return;
+ if (!page)
+ return;
bit = file_page_offset(bitmap, chunk);
- /* set the bit */
+ /* set the bit */
kaddr = kmap_atomic(page, KM_USER0);
if (bitmap->flags & BITMAP_HOSTENDIAN)
set_bit(bit, kaddr);
@@ -859,7 +841,6 @@ static void bitmap_file_set_bit(struct bitmap *bitmap, sector_t block)
/* record page number so it gets flushed to disk when unplug occurs */
set_page_attr(bitmap, page, BITMAP_PAGE_DIRTY);
-
}
/* this gets called when the md device is ready to unplug its underlying
@@ -892,7 +873,7 @@ void bitmap_unplug(struct bitmap *bitmap)
wait = 1;
spin_unlock_irqrestore(&bitmap->lock, flags);
- if (dirty | need_write)
+ if (dirty || need_write)
write_page(bitmap, page, 0);
}
if (wait) { /* if any writes were performed, we need to wait on them */
@@ -905,6 +886,7 @@ void bitmap_unplug(struct bitmap *bitmap)
if (bitmap->flags & BITMAP_WRITE_ERROR)
bitmap_file_kick(bitmap);
}
+EXPORT_SYMBOL(bitmap_unplug);
static void bitmap_set_memory_bits(struct bitmap *bitmap, sector_t offset, int needed);
/* * bitmap_init_from_disk -- called at bitmap_create time to initialize
@@ -947,7 +929,6 @@ static int bitmap_init_from_disk(struct bitmap *bitmap, sector_t start)
if (!bitmap->mddev->bitmap_info.external)
bytes += sizeof(bitmap_super_t);
-
num_pages = (bytes + PAGE_SIZE - 1) / PAGE_SIZE;
if (file && i_size_read(file->f_mapping->host) < bytes) {
@@ -966,7 +947,7 @@ static int bitmap_init_from_disk(struct bitmap *bitmap, sector_t start)
/* We need 4 bits per page, rounded up to a multiple of sizeof(unsigned long) */
bitmap->filemap_attr = kzalloc(
- roundup( DIV_ROUND_UP(num_pages*4, 8), sizeof(unsigned long)),
+ roundup(DIV_ROUND_UP(num_pages*4, 8), sizeof(unsigned long)),
GFP_KERNEL);
if (!bitmap->filemap_attr)
goto err;
@@ -1021,7 +1002,7 @@ static int bitmap_init_from_disk(struct bitmap *bitmap, sector_t start)
if (outofdate) {
/*
* if bitmap is out of date, dirty the
- * whole page and write it out
+ * whole page and write it out
*/
paddr = kmap_atomic(page, KM_USER0);
memset(paddr + offset, 0xff,
@@ -1052,7 +1033,7 @@ static int bitmap_init_from_disk(struct bitmap *bitmap, sector_t start)
}
}
- /* everything went OK */
+ /* everything went OK */
ret = 0;
bitmap_mask_state(bitmap, BITMAP_STALE, MASK_UNSET);
@@ -1080,21 +1061,16 @@ void bitmap_write_all(struct bitmap *bitmap)
*/
int i;
- for (i=0; i < bitmap->file_pages; i++)
+ for (i = 0; i < bitmap->file_pages; i++)
set_page_attr(bitmap, bitmap->filemap[i],
BITMAP_PAGE_NEEDWRITE);
}
-
static void bitmap_count_page(struct bitmap *bitmap, sector_t offset, int inc)
{
sector_t chunk = offset >> CHUNK_BLOCK_SHIFT(bitmap);
unsigned long page = chunk >> PAGE_COUNTER_SHIFT;
bitmap->bp[page].count += inc;
-/*
- if (page == 0) printk("count page 0, offset %llu: %d gives %d\n",
- (unsigned long long)offset, inc, bitmap->bp[page].count);
-*/
bitmap_checkfree(bitmap, page);
}
static bitmap_counter_t *bitmap_get_counter(struct bitmap *bitmap,
@@ -1197,14 +1173,11 @@ void bitmap_daemon_work(mddev_t *mddev)
(sector_t)j << CHUNK_BLOCK_SHIFT(bitmap),
&blocks, 0);
if (bmc) {
-/*
- if (j < 100) printk("bitmap: j=%lu, *bmc = 0x%x\n", j, *bmc);
-*/
if (*bmc)
bitmap->allclean = 0;
if (*bmc == 2) {
- *bmc=1; /* maybe clear the bit next time */
+ *bmc = 1; /* maybe clear the bit next time */
set_page_attr(bitmap, page, BITMAP_PAGE_CLEAN);
} else if (*bmc == 1 && !bitmap->need_sync) {
/* we can clear the bit */
@@ -1243,7 +1216,7 @@ void bitmap_daemon_work(mddev_t *mddev)
done:
if (bitmap->allclean == 0)
- bitmap->mddev->thread->timeout =
+ bitmap->mddev->thread->timeout =
bitmap->mddev->bitmap_info.daemon_sleep;
mutex_unlock(&mddev->bitmap_info.mutex);
}
@@ -1265,7 +1238,7 @@ __acquires(bitmap->lock)
if (bitmap_checkpage(bitmap, page, create) < 0) {
csize = ((sector_t)1) << (CHUNK_BLOCK_SHIFT(bitmap));
- *blocks = csize - (offset & (csize- 1));
+ *blocks = csize - (offset & (csize - 1));
return NULL;
}
/* now locked ... */
@@ -1276,12 +1249,12 @@ __acquires(bitmap->lock)
int hi = (pageoff > PAGE_COUNTER_MASK);
csize = ((sector_t)1) << (CHUNK_BLOCK_SHIFT(bitmap) +
PAGE_COUNTER_SHIFT - 1);
- *blocks = csize - (offset & (csize- 1));
+ *blocks = csize - (offset & (csize - 1));
return &((bitmap_counter_t *)
&bitmap->bp[page].map)[hi];
} else { /* page is allocated */
csize = ((sector_t)1) << (CHUNK_BLOCK_SHIFT(bitmap));
- *blocks = csize - (offset & (csize- 1));
+ *blocks = csize - (offset & (csize - 1));
return (bitmap_counter_t *)
&(bitmap->bp[page].map[pageoff]);
}
@@ -1289,7 +1262,8 @@ __acquires(bitmap->lock)
int bitmap_startwrite(struct bitmap *bitmap, sector_t offset, unsigned long sectors, int behind)
{
- if (!bitmap) return 0;
+ if (!bitmap)
+ return 0;
if (behind) {
int bw;
@@ -1328,10 +1302,10 @@ int bitmap_startwrite(struct bitmap *bitmap, sector_t offset, unsigned long sect
continue;
}
- switch(*bmc) {
+ switch (*bmc) {
case 0:
bitmap_file_set_bit(bitmap, offset);
- bitmap_count_page(bitmap,offset, 1);
+ bitmap_count_page(bitmap, offset, 1);
blk_plug_device_unlocked(bitmap->mddev->queue);
/* fall through */
case 1:
@@ -1345,16 +1319,19 @@ int bitmap_startwrite(struct bitmap *bitmap, sector_t offset, unsigned long sect
offset += blocks;
if (sectors > blocks)
sectors -= blocks;
- else sectors = 0;
+ else
+ sectors = 0;
}
bitmap->allclean = 0;
return 0;
}
+EXPORT_SYMBOL(bitmap_startwrite);
void bitmap_endwrite(struct bitmap *bitmap, sector_t offset, unsigned long sectors,
int success, int behind)
{
- if (!bitmap) return;
+ if (!bitmap)
+ return;
if (behind) {
if (atomic_dec_and_test(&bitmap->behind_writes))
wake_up(&bitmap->behind_wait);
@@ -1391,18 +1368,20 @@ void bitmap_endwrite(struct bitmap *bitmap, sector_t offset, unsigned long secto
wake_up(&bitmap->overflow_wait);
(*bmc)--;
- if (*bmc <= 2) {
+ if (*bmc <= 2)
set_page_attr(bitmap,
filemap_get_page(bitmap, offset >> CHUNK_BLOCK_SHIFT(bitmap)),
BITMAP_PAGE_CLEAN);
- }
+
spin_unlock_irqrestore(&bitmap->lock, flags);
offset += blocks;
if (sectors > blocks)
sectors -= blocks;
- else sectors = 0;
+ else
+ sectors = 0;
}
}
+EXPORT_SYMBOL(bitmap_endwrite);
static int __bitmap_start_sync(struct bitmap *bitmap, sector_t offset, int *blocks,
int degraded)
@@ -1455,14 +1434,14 @@ int bitmap_start_sync(struct bitmap *bitmap, sector_t offset, int *blocks,
}
return rv;
}
+EXPORT_SYMBOL(bitmap_start_sync);
void bitmap_end_sync(struct bitmap *bitmap, sector_t offset, int *blocks, int aborted)
{
bitmap_counter_t *bmc;
unsigned long flags;
-/*
- if (offset == 0) printk("bitmap_end_sync 0 (%d)\n", aborted);
-*/ if (bitmap == NULL) {
+
+ if (bitmap == NULL) {
*blocks = 1024;
return;
}
@@ -1471,26 +1450,23 @@ void bitmap_end_sync(struct bitmap *bitmap, sector_t offset, int *blocks, int ab
if (bmc == NULL)
goto unlock;
/* locked */
-/*
- if (offset == 0) printk("bitmap_end sync found 0x%x, blocks %d\n", *bmc, *blocks);
-*/
if (RESYNC(*bmc)) {
*bmc &= ~RESYNC_MASK;
if (!NEEDED(*bmc) && aborted)
*bmc |= NEEDED_MASK;
else {
- if (*bmc <= 2) {
+ if (*bmc <= 2)
set_page_attr(bitmap,
filemap_get_page(bitmap, offset >> CHUNK_BLOCK_SHIFT(bitmap)),
BITMAP_PAGE_CLEAN);
- }
}
}
unlock:
spin_unlock_irqrestore(&bitmap->lock, flags);
bitmap->allclean = 0;
}
+EXPORT_SYMBOL(bitmap_end_sync);
void bitmap_close_sync(struct bitmap *bitmap)
{
@@ -1507,6 +1483,7 @@ void bitmap_close_sync(struct bitmap *bitmap)
sector += blocks;
}
}
+EXPORT_SYMBOL(bitmap_close_sync);
void bitmap_cond_end_sync(struct bitmap *bitmap, sector_t sector)
{
@@ -1537,6 +1514,7 @@ void bitmap_cond_end_sync(struct bitmap *bitmap, sector_t sector)
bitmap->last_end_sync = jiffies;
sysfs_notify(&bitmap->mddev->kobj, NULL, "sync_completed");
}
+EXPORT_SYMBOL(bitmap_cond_end_sync);
static void bitmap_set_memory_bits(struct bitmap *bitmap, sector_t offset, int needed)
{
@@ -1553,9 +1531,9 @@ static void bitmap_set_memory_bits(struct bitmap *bitmap, sector_t offset, int n
spin_unlock_irq(&bitmap->lock);
return;
}
- if (! *bmc) {
+ if (!*bmc) {
struct page *page;
- *bmc = 1 | (needed?NEEDED_MASK:0);
+ *bmc = 1 | (needed ? NEEDED_MASK : 0);
bitmap_count_page(bitmap, offset, 1);
page = filemap_get_page(bitmap, offset >> CHUNK_BLOCK_SHIFT(bitmap));
set_page_attr(bitmap, page, BITMAP_PAGE_CLEAN);
@@ -1720,9 +1698,9 @@ int bitmap_create(mddev_t *mddev)
bitmap->chunkshift = ffz(~mddev->bitmap_info.chunksize);
/* now that chunksize and chunkshift are set, we can use these macros */
- chunks = (blocks + CHUNK_BLOCK_RATIO(bitmap) - 1) >>
+ chunks = (blocks + CHUNK_BLOCK_RATIO(bitmap) - 1) >>
CHUNK_BLOCK_SHIFT(bitmap);
- pages = (chunks + PAGE_COUNTER_RATIO - 1) / PAGE_COUNTER_RATIO;
+ pages = (chunks + PAGE_COUNTER_RATIO - 1) / PAGE_COUNTER_RATIO;
BUG_ON(!pages);
@@ -1775,11 +1753,11 @@ static ssize_t
location_show(mddev_t *mddev, char *page)
{
ssize_t len;
- if (mddev->bitmap_info.file) {
+ if (mddev->bitmap_info.file)
len = sprintf(page, "file");
- } else if (mddev->bitmap_info.offset) {
+ else if (mddev->bitmap_info.offset)
len = sprintf(page, "%+lld", (long long)mddev->bitmap_info.offset);
- } else
+ else
len = sprintf(page, "none");
len += sprintf(page+len, "\n");
return len;
@@ -1868,7 +1846,7 @@ timeout_show(mddev_t *mddev, char *page)
ssize_t len;
unsigned long secs = mddev->bitmap_info.daemon_sleep / HZ;
unsigned long jifs = mddev->bitmap_info.daemon_sleep % HZ;
-
+
len = sprintf(page, "%lu", secs);
if (jifs)
len += sprintf(page+len, ".%03u", jiffies_to_msecs(jifs));
@@ -2050,12 +2028,3 @@ struct attribute_group md_bitmap_group = {
.attrs = md_bitmap_attrs,
};
-
-/* the bitmap API -- for raid personalities */
-EXPORT_SYMBOL(bitmap_startwrite);
-EXPORT_SYMBOL(bitmap_endwrite);
-EXPORT_SYMBOL(bitmap_start_sync);
-EXPORT_SYMBOL(bitmap_end_sync);
-EXPORT_SYMBOL(bitmap_unplug);
-EXPORT_SYMBOL(bitmap_close_sync);
-EXPORT_SYMBOL(bitmap_cond_end_sync);
* [PATCH 20/24] md/bitmap: optimise scanning of empty bitmaps.
2010-06-01 9:56 [PATCH 00/24] dm-raid456 support using md/raid5.c, now with dirty-log NeilBrown
` (14 preceding siblings ...)
2010-06-01 9:56 ` [PATCH 19/24] md/bitmap: clean up plugging calls NeilBrown
@ 2010-06-01 9:56 ` NeilBrown
2010-06-01 9:56 ` [PATCH 22/24] md/bitmap: prepare for storing write-intent-bitmap via dm-dirty-log NeilBrown
` (7 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: NeilBrown @ 2010-06-01 9:56 UTC (permalink / raw)
To: Heinz Mauelshagen, Alasdair G Kergon; +Cc: linux-raid, dm-devel
A bitmap is stored as one page per 2048 bits.
If none of the bits are set, the page is not allocated.
When bitmap_get_counter finds that a page isn't allocated,
it just reports that one bit's worth of space isn't flagged,
rather than reporting that 2048 bits' worth of space are
unflagged.
This can cause searches for flagged bits (e.g. bitmap_close_sync)
to do more work than is really necessary.
So change bitmap_get_counter (when creating) to report a number of
blocks that more accurately reports the range of the device for which
no counter currently exists.
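
As a rough illustration of why the larger range matters, here is a
minimal userspace sketch, not the kernel code. The sketch_* names and
the shift values are assumptions chosen for the example; only the
"csize - (offset & (csize - 1))" expression and the use of
PAGE_COUNTER_SHIFT - 1 for a missing page are taken from the diff below.

```c
#include <stdint.h>

/* Assumed values for illustration only; the kernel derives these itself. */
#define SKETCH_CHUNK_SHIFT        7   /* assumed: 128 sectors per chunk */
#define SKETCH_PAGE_COUNTER_SHIFT 11  /* 2048 counters per page */

typedef uint64_t sketch_sector_t;

/*
 * Sectors from 'offset' to the end of the naturally aligned range of
 * size 'csize' (a power of two) that contains it.
 */
static sketch_sector_t sketch_blocks_to_boundary(sketch_sector_t offset,
						 sketch_sector_t csize)
{
	return csize - (offset & (csize - 1));
}

/*
 * Count the lookups needed to walk 'total' sectors when every lookup
 * reports either one chunk (page allocated) or the larger range used
 * in the patch for a hijacked or missing page.
 */
static unsigned long sketch_scan_steps(sketch_sector_t total, int page_allocated)
{
	int shift = page_allocated
		? SKETCH_CHUNK_SHIFT
		: SKETCH_CHUNK_SHIFT + SKETCH_PAGE_COUNTER_SHIFT - 1;
	sketch_sector_t csize = (sketch_sector_t)1 << shift;
	sketch_sector_t offset = 0;
	unsigned long steps = 0;

	while (offset < total) {
		offset += sketch_blocks_to_boundary(offset, csize);
		steps++;
	}
	return steps;
}
```

With these assumed shifts, walking 1 << 20 sectors one chunk at a time
takes 8192 lookups, while the larger range takes 8.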
Signed-off-by: NeilBrown <neilb@suse.de>
---
drivers/md/bitmap.c | 23 +++++++++++++----------
1 files changed, 13 insertions(+), 10 deletions(-)
diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
index 3f04699..29a3c86 100644
--- a/drivers/md/bitmap.c
+++ b/drivers/md/bitmap.c
@@ -1235,29 +1235,32 @@ __acquires(bitmap->lock)
unsigned long page = chunk >> PAGE_COUNTER_SHIFT;
unsigned long pageoff = (chunk & PAGE_COUNTER_MASK) << COUNTER_BYTE_SHIFT;
sector_t csize;
+ int err;
+
+ err = bitmap_checkpage(bitmap, page, create);
- if (bitmap_checkpage(bitmap, page, create) < 0) {
+ if (bitmap->bp[page].hijacked ||
+ bitmap->bp[page].map == NULL)
+ csize = ((sector_t)1) << (CHUNK_BLOCK_SHIFT(bitmap) +
+ PAGE_COUNTER_SHIFT - 1);
+ else
csize = ((sector_t)1) << (CHUNK_BLOCK_SHIFT(bitmap));
- *blocks = csize - (offset & (csize - 1));
+ *blocks = csize - (offset & (csize - 1));
+
+ if (err < 0)
return NULL;
- }
+
/* now locked ... */
if (bitmap->bp[page].hijacked) { /* hijacked pointer */
/* should we use the first or second counter field
* of the hijacked pointer? */
int hi = (pageoff > PAGE_COUNTER_MASK);
- csize = ((sector_t)1) << (CHUNK_BLOCK_SHIFT(bitmap) +
- PAGE_COUNTER_SHIFT - 1);
- *blocks = csize - (offset & (csize - 1));
return &((bitmap_counter_t *)
&bitmap->bp[page].map)[hi];
- } else { /* page is allocated */
- csize = ((sector_t)1) << (CHUNK_BLOCK_SHIFT(bitmap));
- *blocks = csize - (offset & (csize - 1));
+ } else /* page is allocated */
return (bitmap_counter_t *)
&(bitmap->bp[page].map[pageoff]);
- }
}
int bitmap_startwrite(struct bitmap *bitmap, sector_t offset, unsigned long sectors, int behind)
* [PATCH 18/24] md/bitmap: reduce dependence on sysfs.
2010-06-01 9:56 [PATCH 00/24] dm-raid456 support using md/raid5.c, now with dirty-log NeilBrown
` (19 preceding siblings ...)
2010-06-01 9:56 ` [PATCH 17/24] md/bitmap: white space clean up and similar NeilBrown
@ 2010-06-01 9:56 ` NeilBrown
2010-06-01 9:56 ` [PATCH 23/24] md/bitmap: separate out loading a bitmap from initialising the structures NeilBrown
` (2 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: NeilBrown @ 2010-06-01 9:56 UTC (permalink / raw)
To: Heinz Mauelshagen, Alasdair G Kergon; +Cc: linux-raid, dm-devel
For dm-raid45 we will want to use bitmaps in dm-targets which don't
have entries in sysfs, so cope with the mddev not living in sysfs.
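
The pattern is simply "do nothing when there is no sysfs entry". A
minimal userspace sketch follows, assuming a hypothetical
sketch_dirent type and a counter in place of a real notification; it
is not the sysfs API, only the shape of the NULL guard.

```c
#include <stddef.h>

/* Hypothetical stand-in for a sysfs directory entry. */
struct sketch_dirent {
	int notified;	/* counts notifications, for illustration */
};

/* Stand-in for the real notification; must not be handed NULL. */
static void sketch_notify(struct sketch_dirent *sd)
{
	sd->notified++;
}

/* NULL-tolerant wrapper: a device without a sysfs entry is just skipped. */
static void sketch_notify_safe(struct sketch_dirent *sd)
{
	if (sd)
		sketch_notify(sd);
}
```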
Signed-off-by: NeilBrown <neilb@suse.de>
---
drivers/md/bitmap.c | 7 ++++---
1 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
index 67fb32d..8af4d65 100644
--- a/drivers/md/bitmap.c
+++ b/drivers/md/bitmap.c
@@ -1358,7 +1358,7 @@ void bitmap_endwrite(struct bitmap *bitmap, sector_t offset, unsigned long secto
bitmap->events_cleared < bitmap->mddev->events) {
bitmap->events_cleared = bitmap->mddev->events;
bitmap->need_sync = 1;
- sysfs_notify_dirent(bitmap->sysfs_can_clear);
+ sysfs_notify_dirent_safe(bitmap->sysfs_can_clear);
}
if (!success && ! (*bmc & NEEDED_MASK))
@@ -1643,7 +1643,7 @@ int bitmap_create(mddev_t *mddev)
struct file *file = mddev->bitmap_info.file;
int err;
sector_t start;
- struct sysfs_dirent *bm;
+ struct sysfs_dirent *bm = NULL;
BUILD_BUG_ON(sizeof(bitmap_super_t) != 256);
@@ -1664,7 +1664,8 @@ int bitmap_create(mddev_t *mddev)
bitmap->mddev = mddev;
- bm = sysfs_get_dirent(mddev->kobj.sd, NULL, "bitmap");
+ if (mddev->kobj.sd)
+ bm = sysfs_get_dirent(mddev->kobj.sd, NULL, "bitmap");
if (bm) {
bitmap->sysfs_can_clear = sysfs_get_dirent(bm, NULL, "can_clear");
sysfs_put(bm);
* [PATCH 21/24] dm-dirty-log: allow log size to be different from target size.
2010-06-01 9:56 [PATCH 00/24] dm-raid456 support using md/raid5.c, now with dirty-log NeilBrown
` (22 preceding siblings ...)
2010-06-01 9:56 ` [PATCH 16/24] dm-raid456: add message handler NeilBrown
@ 2010-06-01 9:56 ` NeilBrown
[not found] ` <1275490641.30896.40.camel@o>
23 siblings, 1 reply; 27+ messages in thread
From: NeilBrown @ 2010-06-01 9:56 UTC (permalink / raw)
To: Heinz Mauelshagen, Alasdair G Kergon; +Cc: linux-raid, dm-devel
With RAID1, the dirty log covers the size of the target - the number
of bits is the target size divided by the region size.
For RAID5 and similar the dirty log needs to cover the parity blocks,
so it is best to base the dirty log size on the size of the component
devices rather than the size of the array.
So when creating a dirty log, allow the log_size to be specified
separately from the target, and in raid1, set it to the target length.
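
To make the difference concrete, here is a small userspace sketch of
the arithmetic. The sketch_* helpers are hypothetical, and the array
layout (one parity device, so array length = (ndevs - 1) * device size)
is an assumption for the example; the round-up division mirrors how
dm_sector_div_up() is used in the diff.

```c
#include <stdint.h>

typedef uint64_t sketch_sector_t;

/* Round-up division. */
static sketch_sector_t sketch_div_up(sketch_sector_t n, sketch_sector_t d)
{
	return (n + d - 1) / d;
}

/* Number of log bits needed to cover 'log_size' sectors. */
static sketch_sector_t sketch_region_count(sketch_sector_t log_size,
					   uint32_t region_size)
{
	return sketch_div_up(log_size, region_size);
}

/*
 * Sanity check on the region size in the spirit of _check_region_size():
 * at least 2 sectors, no larger than the log, and a power of two.
 * Written from scratch for this sketch.
 */
static int sketch_region_size_ok(sketch_sector_t log_size, uint32_t region_size)
{
	if (region_size < 2 || region_size > log_size)
		return 0;
	return (region_size & (region_size - 1)) == 0;
}
```

For an assumed 4-device array with 1 << 21 sectors per device and a
region size of 1024, basing the log on the array length would give
6144 regions, while basing it on one component gives 2048.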
Signed-off-by: NeilBrown <neilb@suse.de>
---
drivers/md/dm-log-userspace-base.c | 11 +++++++----
drivers/md/dm-log.c | 18 +++++++++++-------
drivers/md/dm-raid1.c | 4 ++--
include/linux/dm-dirty-log.h | 3 ++-
4 files changed, 22 insertions(+), 14 deletions(-)
diff --git a/drivers/md/dm-log-userspace-base.c b/drivers/md/dm-log-userspace-base.c
index 1ed0094..935a49b 100644
--- a/drivers/md/dm-log-userspace-base.c
+++ b/drivers/md/dm-log-userspace-base.c
@@ -94,7 +94,7 @@ retry:
return -ESRCH;
}
-static int build_constructor_string(struct dm_target *ti,
+static int build_constructor_string(sector_t log_size,
unsigned argc, char **argv,
char **ctr_str)
{
@@ -114,7 +114,7 @@ static int build_constructor_string(struct dm_target *ti,
return -ENOMEM;
}
- str_size = sprintf(str, "%llu", (unsigned long long)ti->len);
+ str_size = sprintf(str, "%llu", (unsigned long long)log_size);
for (i = 0; i < argc; i++)
str_size += sprintf(str + str_size, " %s", argv[i]);
@@ -136,6 +136,7 @@ static int build_constructor_string(struct dm_target *ti,
* else.
*/
static int userspace_ctr(struct dm_dirty_log *log, struct dm_target *ti,
+ sector_t log_size,
unsigned argc, char **argv)
{
int r = 0;
@@ -171,7 +172,9 @@ static int userspace_ctr(struct dm_dirty_log *log, struct dm_target *ti,
spin_lock_init(&lc->flush_lock);
INIT_LIST_HEAD(&lc->flush_list);
- str_size = build_constructor_string(ti, argc - 1, argv + 1, &ctr_str);
+ str_size = build_constructor_string(log_size,
+ argc - 1, argv + 1,
+ &ctr_str);
if (str_size < 0) {
kfree(lc);
return str_size;
@@ -197,7 +200,7 @@ static int userspace_ctr(struct dm_dirty_log *log, struct dm_target *ti,
}
lc->region_size = (uint32_t)rdata;
- lc->region_count = dm_sector_div_up(ti->len, lc->region_size);
+ lc->region_count = dm_sector_div_up(log_size, lc->region_size);
out:
if (r) {
diff --git a/drivers/md/dm-log.c b/drivers/md/dm-log.c
index 5a08be0..a232c14 100644
--- a/drivers/md/dm-log.c
+++ b/drivers/md/dm-log.c
@@ -146,6 +146,7 @@ EXPORT_SYMBOL(dm_dirty_log_type_unregister);
struct dm_dirty_log *dm_dirty_log_create(const char *type_name,
struct dm_target *ti,
+ sector_t log_size,
int (*flush_callback_fn)(struct dm_target *ti),
unsigned int argc, char **argv)
{
@@ -164,7 +165,7 @@ struct dm_dirty_log *dm_dirty_log_create(const char *type_name,
log->flush_callback_fn = flush_callback_fn;
log->type = type;
- if (type->ctr(log, ti, argc, argv)) {
+ if (type->ctr(log, ti, log_size, argc, argv)) {
kfree(log);
put_type(type);
return NULL;
@@ -335,9 +336,9 @@ static int read_header(struct log_c *log)
return 0;
}
-static int _check_region_size(struct dm_target *ti, uint32_t region_size)
+static int _check_region_size(sector_t log_size, uint32_t region_size)
{
- if (region_size < 2 || region_size > ti->len)
+ if (region_size < 2 || region_size > log_size)
return 0;
if (!is_power_of_2(region_size))
@@ -353,6 +354,7 @@ static int _check_region_size(struct dm_target *ti, uint32_t region_size)
*--------------------------------------------------------------*/
#define BYTE_SHIFT 3
static int create_log_context(struct dm_dirty_log *log, struct dm_target *ti,
+ sector_t log_size,
unsigned int argc, char **argv,
struct dm_dev *dev)
{
@@ -382,12 +384,12 @@ static int create_log_context(struct dm_dirty_log *log, struct dm_target *ti,
}
if (sscanf(argv[0], "%u", &region_size) != 1 ||
- !_check_region_size(ti, region_size)) {
+ !_check_region_size(log_size, region_size)) {
DMWARN("invalid region size %s", argv[0]);
return -EINVAL;
}
- region_count = dm_sector_div_up(ti->len, region_size);
+ region_count = dm_sector_div_up(log_size, region_size);
lc = kmalloc(sizeof(*lc), GFP_KERNEL);
if (!lc) {
@@ -507,9 +509,10 @@ static int create_log_context(struct dm_dirty_log *log, struct dm_target *ti,
}
static int core_ctr(struct dm_dirty_log *log, struct dm_target *ti,
+ sector_t log_size,
unsigned int argc, char **argv)
{
- return create_log_context(log, ti, argc, argv, NULL);
+ return create_log_context(log, ti, log_size, argc, argv, NULL);
}
static void destroy_log_context(struct log_c *lc)
@@ -533,6 +536,7 @@ static void core_dtr(struct dm_dirty_log *log)
* argv contains log_device region_size followed optionally by [no]sync
*--------------------------------------------------------------*/
static int disk_ctr(struct dm_dirty_log *log, struct dm_target *ti,
+ sector_t log_size,
unsigned int argc, char **argv)
{
int r;
@@ -547,7 +551,7 @@ static int disk_ctr(struct dm_dirty_log *log, struct dm_target *ti,
if (r)
return r;
- r = create_log_context(log, ti, argc - 1, argv + 1, dev);
+ r = create_log_context(log, ti, log_size, argc - 1, argv + 1, dev);
if (r) {
dm_put_device(ti, dev);
return r;
diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
index ddda531..ea732fc 100644
--- a/drivers/md/dm-raid1.c
+++ b/drivers/md/dm-raid1.c
@@ -968,8 +968,8 @@ static struct dm_dirty_log *create_dirty_log(struct dm_target *ti,
return NULL;
}
- dl = dm_dirty_log_create(argv[0], ti, mirror_flush, param_count,
- argv + 2);
+ dl = dm_dirty_log_create(argv[0], ti, ti->len, mirror_flush,
+ param_count, argv + 2);
if (!dl) {
ti->error = "Error creating mirror dirty log";
return NULL;
diff --git a/include/linux/dm-dirty-log.h b/include/linux/dm-dirty-log.h
index 7084503..641419f 100644
--- a/include/linux/dm-dirty-log.h
+++ b/include/linux/dm-dirty-log.h
@@ -33,6 +33,7 @@ struct dm_dirty_log_type {
struct list_head list;
int (*ctr)(struct dm_dirty_log *log, struct dm_target *ti,
+ sector_t log_size,
unsigned argc, char **argv);
void (*dtr)(struct dm_dirty_log *log);
@@ -137,7 +138,7 @@ int dm_dirty_log_type_unregister(struct dm_dirty_log_type *type);
* type->constructor/destructor() directly.
*/
struct dm_dirty_log *dm_dirty_log_create(const char *type_name,
- struct dm_target *ti,
+ struct dm_target *ti, sector_t log_size,
int (*flush_callback_fn)(struct dm_target *ti),
unsigned argc, char **argv);
void dm_dirty_log_destroy(struct dm_dirty_log *log);
* [PATCH 23/24] md/bitmap: separate out loading a bitmap from initialising the structures.
2010-06-01 9:56 [PATCH 00/24] dm-raid456 support using md/raid5.c, now with dirty-log NeilBrown
` (20 preceding siblings ...)
2010-06-01 9:56 ` [PATCH 18/24] md/bitmap: reduce dependence on sysfs NeilBrown
@ 2010-06-01 9:56 ` NeilBrown
2010-06-01 9:56 ` [PATCH 16/24] dm-raid456: add message handler NeilBrown
2010-06-01 9:56 ` [PATCH 21/24] dm-dirty-log: allow log size to be different from target size NeilBrown
23 siblings, 0 replies; 27+ messages in thread
From: NeilBrown @ 2010-06-01 9:56 UTC (permalink / raw)
To: Heinz Mauelshagen, Alasdair G Kergon; +Cc: linux-raid, dm-devel
dm makes this distinction between ->ctr and ->resume, so we need to
too.
Also get the new bitmap_load to clear out the bitmap first, as this is
most consistent with the dm suspend/resume approach.
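
The split can be pictured with a small userspace model. This is a
sketch only: the sketch_* types and the flat "stored" array are
assumptions standing in for the real bitmap and its backing store. The
point it shows is that the load step discards cached state before
re-reading, so it is safe to call on every resume.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_CHUNKS 8

/* Hypothetical in-memory bitmap: one counter per chunk. */
struct sketch_bitmap {
	unsigned char counters[SKETCH_CHUNKS];
};

/* ->ctr step: set up the structures only; nothing is read from the store. */
static int sketch_create(struct sketch_bitmap *bm)
{
	memset(bm, 0, sizeof(*bm));
	return 0;
}

/*
 * ->resume step: forget whatever was cached, then rebuild the in-memory
 * state from the store. With no bitmap there is nothing to do.
 */
static int sketch_load(struct sketch_bitmap *bm, const unsigned char *stored)
{
	size_t i;

	if (!bm)
		return 0;

	memset(bm->counters, 0, sizeof(bm->counters));
	for (i = 0; i < SKETCH_CHUNKS; i++)
		if (stored[i])
			bm->counters[i] = 1;
	return 0;
}
```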
Signed-off-by: NeilBrown <neilb@suse.de>
---
drivers/md/bitmap.c | 69 +++++++++++++++++++++++++++++++++++----------------
drivers/md/bitmap.h | 1 +
drivers/md/md.c | 13 ++++++++--
3 files changed, 60 insertions(+), 23 deletions(-)
diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
index 9376526..1ba1e12 100644
--- a/drivers/md/bitmap.c
+++ b/drivers/md/bitmap.c
@@ -1681,7 +1681,6 @@ int bitmap_create(mddev_t *mddev)
unsigned long pages;
struct file *file = mddev->bitmap_info.file;
int err;
- sector_t start;
struct sysfs_dirent *bm = NULL;
BUILD_BUG_ON(sizeof(bitmap_super_t) != 256);
@@ -1763,13 +1762,40 @@ int bitmap_create(mddev_t *mddev)
if (!bitmap->bp)
goto error;
- /* now that we have some pages available, initialize the in-memory
- * bitmap from the on-disk bitmap */
- start = 0;
- if (mddev->degraded == 0
- || bitmap->events_cleared == mddev->events)
- /* no need to keep dirty bits to optimise a re-add of a missing device */
- start = mddev->recovery_cp;
+ printk(KERN_INFO "created bitmap (%lu pages) for device %s\n",
+ pages, bmname(bitmap));
+
+ mddev->bitmap = bitmap;
+
+
+ return (bitmap->flags & BITMAP_WRITE_ERROR) ? -EIO : 0;
+
+ error:
+ bitmap_free(bitmap);
+ return err;
+}
+
+int bitmap_load(mddev_t *mddev)
+{
+ int err = 0;
+ sector_t sector = 0;
+ struct bitmap *bitmap = mddev->bitmap;
+
+ if (!bitmap)
+ goto out;
+
+ /* Clear out old bitmap info first: Either there is none, or we
+ * are resuming after someone else has possibly changed things,
+ * so we should forget old cached info.
+ * All chunks should be clean, but some might need_sync.
+ */
+ while (sector < mddev->resync_max_sectors) {
+ int blocks;
+ bitmap_start_sync(bitmap, sector, &blocks, 0);
+ sector += blocks;
+ }
+ bitmap_close_sync(bitmap);
+
if (mddev->bitmap_info.log) {
unsigned long i;
struct dm_dirty_log *log = mddev->bitmap_info.log;
@@ -1778,29 +1804,30 @@ int bitmap_create(mddev_t *mddev)
bitmap_set_memory_bits(bitmap,
(sector_t)i << CHUNK_BLOCK_SHIFT(bitmap),
1);
- err = 0;
- } else
- err = bitmap_init_from_disk(bitmap, start);
+ } else {
+ sector_t start = 0;
+ if (mddev->degraded == 0
+ || bitmap->events_cleared == mddev->events)
+ /* no need to keep dirty bits to optimise a
+ * re-add of a missing device */
+ start = mddev->recovery_cp;
+ err = bitmap_init_from_disk(bitmap, start);
+ }
if (err)
- goto error;
-
- printk(KERN_INFO "created bitmap (%lu pages) for device %s\n",
- pages, bmname(bitmap));
-
- mddev->bitmap = bitmap;
+ goto out;
mddev->thread->timeout = mddev->bitmap_info.daemon_sleep;
md_wakeup_thread(mddev->thread);
bitmap_update_sb(bitmap);
- return (bitmap->flags & BITMAP_WRITE_ERROR) ? -EIO : 0;
-
- error:
- bitmap_free(bitmap);
+ if (bitmap->flags & BITMAP_WRITE_ERROR)
+ err = -EIO;
+out:
return err;
}
+EXPORT_SYMBOL_GPL(bitmap_load);
static ssize_t
location_show(mddev_t *mddev, char *page)
diff --git a/drivers/md/bitmap.h b/drivers/md/bitmap.h
index a7a1113..e872a7b 100644
--- a/drivers/md/bitmap.h
+++ b/drivers/md/bitmap.h
@@ -254,6 +254,7 @@ struct bitmap {
/* these are used only by md/bitmap */
int bitmap_create(mddev_t *mddev);
+int bitmap_load(mddev_t *mddev);
void bitmap_flush(mddev_t *mddev);
void bitmap_destroy(mddev_t *mddev);
diff --git a/drivers/md/md.c b/drivers/md/md.c
index f24efd2..e0a9bf8 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -4568,7 +4568,11 @@ static int do_md_run(mddev_t *mddev)
err = md_run(mddev);
if (err)
goto out;
-
+ err = bitmap_load(mddev);
+ if (err) {
+ bitmap_destroy(mddev);
+ goto out;
+ }
set_capacity(mddev->gendisk, mddev->array_sectors);
revalidate_disk(mddev->gendisk);
kobject_uevent(&disk_to_dev(mddev->gendisk)->kobj, KOBJ_CHANGE);
@@ -5356,8 +5360,11 @@ static int set_bitmap_file(mddev_t *mddev, int fd)
err = 0;
if (mddev->pers) {
mddev->pers->quiesce(mddev, 1);
- if (fd >= 0)
+ if (fd >= 0) {
err = bitmap_create(mddev);
+ if (!err)
+ err = bitmap_load(mddev);
+ }
if (fd < 0 || err) {
bitmap_destroy(mddev);
fd = -1; /* make sure to put the file */
@@ -5606,6 +5613,8 @@ static int update_array_info(mddev_t *mddev, mdu_array_info_t *info)
mddev->bitmap_info.default_offset;
mddev->pers->quiesce(mddev, 1);
rv = bitmap_create(mddev);
+ if (!rv)
+ rv = bitmap_load(mddev);
if (rv)
bitmap_destroy(mddev);
mddev->pers->quiesce(mddev, 0);
* [PATCH 22/24] md/bitmap: prepare for storing write-intent-bitmap via dm-dirty-log.
2010-06-01 9:56 [PATCH 00/24] dm-raid456 support using md/raid5.c, now with dirty-log NeilBrown
` (15 preceding siblings ...)
2010-06-01 9:56 ` [PATCH 20/24] md/bitmap: optimise scanning of empty bitmaps NeilBrown
@ 2010-06-01 9:56 ` NeilBrown
2010-06-01 9:56 ` [PATCH 15/24] dm-raid456: add suspend/resume method NeilBrown
` (6 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: NeilBrown @ 2010-06-01 9:56 UTC (permalink / raw)
To: Heinz Mauelshagen, Alasdair G Kergon; +Cc: linux-raid, dm-devel
This allows md/raid5 to fully work as a dm target.
Normally md uses a 'filemap' which contains a list of pages of bits
each of which may be written separately.
dm-log uses an all-or-nothing approach to writing the log, so
when using a dm-log, ->filemap is NULL and the flags normally stored
in filemap_attr are stored in ->logattrs instead.
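
A userspace sketch of that flag placement follows. The sketch_* names
are hypothetical and plain bit operations stand in for the kernel's
__set_bit/test_bit; the bit numbering, (page index << 2) + attr for a
file page and just attr for the log, follows the diff below.

```c
#include <limits.h>
#include <stddef.h>

enum sketch_page_attr {
	SKETCH_PAGE_DIRTY = 0,
	SKETCH_PAGE_CLEAN = 1,
	SKETCH_PAGE_NEEDWRITE = 2,
};

/* Hypothetical stand-in for struct page; only the index matters here. */
struct sketch_page {
	unsigned long index;
};

struct sketch_bitmap {
	unsigned long *filemap_attr;	/* 4 bits per file page */
	unsigned long logattrs;		/* one set of flags for a dirty log */
};

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Find the word and mask holding 'attr'; page == NULL selects logattrs. */
static unsigned long *sketch_attr_word(struct sketch_bitmap *bm,
				       const struct sketch_page *page,
				       enum sketch_page_attr attr,
				       unsigned long *mask)
{
	unsigned long nr;

	if (!page) {
		*mask = 1UL << attr;
		return &bm->logattrs;
	}
	nr = (page->index << 2) + attr;
	*mask = 1UL << (nr % SKETCH_BITS_PER_LONG);
	return &bm->filemap_attr[nr / SKETCH_BITS_PER_LONG];
}

static void sketch_set_attr(struct sketch_bitmap *bm,
			    const struct sketch_page *page,
			    enum sketch_page_attr attr)
{
	unsigned long mask;
	unsigned long *word = sketch_attr_word(bm, page, attr, &mask);

	*word |= mask;
}

static int sketch_test_attr(struct sketch_bitmap *bm,
			    const struct sketch_page *page,
			    enum sketch_page_attr attr)
{
	unsigned long mask;
	unsigned long *word = sketch_attr_word(bm, page, attr, &mask);

	return (*word & mask) != 0;
}
```

Because the log is written as a whole, a single set of flags is enough
in that case; callers pass a NULL page and never index filemap_attr.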
Signed-off-by: NeilBrown <neilb@suse.de>
---
drivers/md/bitmap.c | 128 +++++++++++++++++++++++++++++++++++----------------
drivers/md/bitmap.h | 5 ++
drivers/md/md.h | 5 ++
3 files changed, 99 insertions(+), 39 deletions(-)
diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
index 29a3c86..9376526 100644
--- a/drivers/md/bitmap.c
+++ b/drivers/md/bitmap.c
@@ -29,6 +29,7 @@
#include "md.h"
#include "bitmap.h"
+#include <linux/dm-dirty-log.h>
/* debug macros */
#define DEBUG 0
@@ -694,6 +695,8 @@ static inline unsigned long file_page_offset(struct bitmap *bitmap, unsigned lon
static inline struct page *filemap_get_page(struct bitmap *bitmap,
unsigned long chunk)
{
+ if (bitmap->filemap == NULL)
+ return NULL;
if (file_page_index(bitmap, chunk) >= bitmap->file_pages)
return NULL;
return bitmap->filemap[file_page_index(bitmap, chunk)
@@ -793,19 +796,28 @@ enum bitmap_page_attr {
static inline void set_page_attr(struct bitmap *bitmap, struct page *page,
enum bitmap_page_attr attr)
{
- __set_bit((page->index<<2) + attr, bitmap->filemap_attr);
+ if (page)
+ __set_bit((page->index<<2) + attr, bitmap->filemap_attr);
+ else
+ __set_bit(attr, &bitmap->logattrs);
}
static inline void clear_page_attr(struct bitmap *bitmap, struct page *page,
enum bitmap_page_attr attr)
{
- __clear_bit((page->index<<2) + attr, bitmap->filemap_attr);
+ if (page)
+ __clear_bit((page->index<<2) + attr, bitmap->filemap_attr);
+ else
+ __clear_bit(attr, &bitmap->logattrs);
}
static inline unsigned long test_page_attr(struct bitmap *bitmap, struct page *page,
enum bitmap_page_attr attr)
{
- return test_bit((page->index<<2) + attr, bitmap->filemap_attr);
+ if (page)
+ return test_bit((page->index<<2) + attr, bitmap->filemap_attr);
+ else
+ return test_bit(attr, &bitmap->logattrs);
}
/*
@@ -818,27 +830,30 @@ static inline unsigned long test_page_attr(struct bitmap *bitmap, struct page *p
static void bitmap_file_set_bit(struct bitmap *bitmap, sector_t block)
{
unsigned long bit;
- struct page *page;
+ struct page *page = NULL;
void *kaddr;
unsigned long chunk = block >> CHUNK_BLOCK_SHIFT(bitmap);
- if (!bitmap->filemap)
- return;
-
- page = filemap_get_page(bitmap, chunk);
- if (!page)
- return;
- bit = file_page_offset(bitmap, chunk);
+ if (!bitmap->filemap) {
+ struct dm_dirty_log *log = bitmap->mddev->bitmap_info.log;
+ if (log)
+ log->type->mark_region(log, chunk);
+ } else {
- /* set the bit */
- kaddr = kmap_atomic(page, KM_USER0);
- if (bitmap->flags & BITMAP_HOSTENDIAN)
- set_bit(bit, kaddr);
- else
- ext2_set_bit(bit, kaddr);
- kunmap_atomic(kaddr, KM_USER0);
- PRINTK("set file bit %lu page %lu\n", bit, page->index);
+ page = filemap_get_page(bitmap, chunk);
+ if (!page)
+ return;
+ bit = file_page_offset(bitmap, chunk);
+ /* set the bit */
+ kaddr = kmap_atomic(page, KM_USER0);
+ if (bitmap->flags & BITMAP_HOSTENDIAN)
+ set_bit(bit, kaddr);
+ else
+ ext2_set_bit(bit, kaddr);
+ kunmap_atomic(kaddr, KM_USER0);
+ PRINTK("set file bit %lu page %lu\n", bit, page->index);
+ }
/* record page number so it gets flushed to disk when unplug occurs */
set_page_attr(bitmap, page, BITMAP_PAGE_DIRTY);
}
@@ -855,6 +870,16 @@ void bitmap_unplug(struct bitmap *bitmap)
if (!bitmap)
return;
+ if (!bitmap->filemap) {
+ /* Must be using a dirty_log */
+ struct dm_dirty_log *log = bitmap->mddev->bitmap_info.log;
+ dirty = test_and_clear_bit(BITMAP_PAGE_DIRTY, &bitmap->logattrs);
+ need_write = test_and_clear_bit(BITMAP_PAGE_NEEDWRITE, &bitmap->logattrs);
+ if (dirty || need_write)
+ if (log->type->flush(log))
+ bitmap->flags |= BITMAP_WRITE_ERROR;
+ goto out;
+ }
/* look at each page to see if there are any set bits that need to be
* flushed out to disk */
@@ -883,6 +908,7 @@ void bitmap_unplug(struct bitmap *bitmap)
else
md_super_wait(bitmap->mddev);
}
+out:
if (bitmap->flags & BITMAP_WRITE_ERROR)
bitmap_file_kick(bitmap);
}
@@ -925,11 +951,11 @@ static int bitmap_init_from_disk(struct bitmap *bitmap, sector_t start)
printk(KERN_INFO "%s: bitmap file is out of date, doing full "
"recovery\n", bmname(bitmap));
- bytes = (chunks + 7) / 8;
+ bytes = DIV_ROUND_UP(bitmap->chunks, 8);
if (!bitmap->mddev->bitmap_info.external)
bytes += sizeof(bitmap_super_t);
- num_pages = (bytes + PAGE_SIZE - 1) / PAGE_SIZE;
+ num_pages = DIV_ROUND_UP(bytes, PAGE_SIZE);
if (file && i_size_read(file->f_mapping->host) < bytes) {
printk(KERN_INFO "%s: bitmap file too short %lu < %lu\n",
@@ -1090,6 +1116,7 @@ void bitmap_daemon_work(mddev_t *mddev)
struct page *page = NULL, *lastpage = NULL;
int blocks;
void *paddr;
+ struct dm_dirty_log *log = mddev->bitmap_info.log;
/* Use a mutex to guard daemon_work against
* bitmap_destroy.
@@ -1114,11 +1141,12 @@ void bitmap_daemon_work(mddev_t *mddev)
spin_lock_irqsave(&bitmap->lock, flags);
for (j = 0; j < bitmap->chunks; j++) {
bitmap_counter_t *bmc;
- if (!bitmap->filemap)
- /* error or shutdown */
- break;
-
- page = filemap_get_page(bitmap, j);
+ if (!bitmap->filemap) {
+ if (!log)
+ /* error or shutdown */
+ break;
+ } else
+ page = filemap_get_page(bitmap, j);
if (page != lastpage) {
/* skip this page unless it's marked as needing cleaning */
@@ -1187,14 +1215,17 @@ void bitmap_daemon_work(mddev_t *mddev)
-1);
/* clear the bit */
- paddr = kmap_atomic(page, KM_USER0);
- if (bitmap->flags & BITMAP_HOSTENDIAN)
- clear_bit(file_page_offset(bitmap, j),
- paddr);
- else
- ext2_clear_bit(file_page_offset(bitmap, j),
- paddr);
- kunmap_atomic(paddr, KM_USER0);
+ if (page) {
+ paddr = kmap_atomic(page, KM_USER0);
+ if (bitmap->flags & BITMAP_HOSTENDIAN)
+ clear_bit(file_page_offset(bitmap, j),
+ paddr);
+ else
+ ext2_clear_bit(file_page_offset(bitmap, j),
+ paddr);
+ kunmap_atomic(paddr, KM_USER0);
+ } else
+ log->type->clear_region(log, j);
}
} else
j |= PAGE_COUNTER_MASK;
@@ -1202,12 +1233,16 @@ void bitmap_daemon_work(mddev_t *mddev)
spin_unlock_irqrestore(&bitmap->lock, flags);
/* now sync the final page */
- if (lastpage != NULL) {
+ if (lastpage != NULL || log != NULL) {
spin_lock_irqsave(&bitmap->lock, flags);
if (test_page_attr(bitmap, lastpage, BITMAP_PAGE_NEEDWRITE)) {
clear_page_attr(bitmap, lastpage, BITMAP_PAGE_NEEDWRITE);
spin_unlock_irqrestore(&bitmap->lock, flags);
- write_page(bitmap, lastpage, 0);
+ if (lastpage)
+ write_page(bitmap, lastpage, 0);
+ else
+ if (log->type->flush(log))
+ bitmap->flags |= BITMAP_WRITE_ERROR;
} else {
set_page_attr(bitmap, lastpage, BITMAP_PAGE_NEEDWRITE);
spin_unlock_irqrestore(&bitmap->lock, flags);
@@ -1372,7 +1407,9 @@ void bitmap_endwrite(struct bitmap *bitmap, sector_t offset, unsigned long secto
(*bmc)--;
if (*bmc <= 2)
set_page_attr(bitmap,
- filemap_get_page(bitmap, offset >> CHUNK_BLOCK_SHIFT(bitmap)),
+ filemap_get_page(
+ bitmap,
+ offset >> CHUNK_BLOCK_SHIFT(bitmap)),
BITMAP_PAGE_CLEAN);
spin_unlock_irqrestore(&bitmap->lock, flags);
@@ -1649,10 +1686,13 @@ int bitmap_create(mddev_t *mddev)
BUILD_BUG_ON(sizeof(bitmap_super_t) != 256);
- if (!file && !mddev->bitmap_info.offset) /* bitmap disabled, nothing to do */
+ if (!file
+ && !mddev->bitmap_info.offset
+ && !mddev->bitmap_info.log) /* bitmap disabled, nothing to do */
return 0;
BUG_ON(file && mddev->bitmap_info.offset);
+ BUG_ON(mddev->bitmap_info.offset && mddev->bitmap_info.log);
bitmap = kzalloc(sizeof(*bitmap), GFP_KERNEL);
if (!bitmap)
@@ -1730,7 +1770,17 @@ int bitmap_create(mddev_t *mddev)
|| bitmap->events_cleared == mddev->events)
/* no need to keep dirty bits to optimise a re-add of a missing device */
start = mddev->recovery_cp;
- err = bitmap_init_from_disk(bitmap, start);
+ if (mddev->bitmap_info.log) {
+ unsigned long i;
+ struct dm_dirty_log *log = mddev->bitmap_info.log;
+ for (i = 0; i < bitmap->chunks; i++)
+ if (!log->type->in_sync(log, i, 1))
+ bitmap_set_memory_bits(bitmap,
+ (sector_t)i << CHUNK_BLOCK_SHIFT(bitmap),
+ 1);
+ err = 0;
+ } else
+ err = bitmap_init_from_disk(bitmap, start);
if (err)
goto error;
diff --git a/drivers/md/bitmap.h b/drivers/md/bitmap.h
index 3797dea..a7a1113 100644
--- a/drivers/md/bitmap.h
+++ b/drivers/md/bitmap.h
@@ -222,6 +222,10 @@ struct bitmap {
unsigned long file_pages; /* number of pages in the file */
int last_page_size; /* bytes in the last page */
+ unsigned long logattrs; /* used when filemap_attr doesn't exist
+ * because we are working with a dirty_log
+ */
+
unsigned long flags;
int allclean;
@@ -243,6 +247,7 @@ struct bitmap {
wait_queue_head_t behind_wait;
struct sysfs_dirent *sysfs_can_clear;
+
};
/* the bitmap API */
diff --git a/drivers/md/md.h b/drivers/md/md.h
index a97e4b0..e97466f 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -314,6 +314,11 @@ struct mddev_s
* hot-adding a bitmap. It should
* eventually be settable by sysfs.
*/
+ /* When md is serving under dm, it might use a
+ * dirty_log to store the bits.
+ */
+ struct dm_dirty_log *log;
+
struct mutex mutex;
unsigned long chunksize;
unsigned long daemon_sleep; /* how many seconds between updates? */
* [PATCH 24/24] dm-raid456: switch to use dm_dirty_log for tracking dirty regions.
2010-06-01 9:56 [PATCH 00/24] dm-raid456 support using md/raid5.c, now with dirty-log NeilBrown
` (17 preceding siblings ...)
2010-06-01 9:56 ` [PATCH 15/24] dm-raid456: add suspend/resume method NeilBrown
@ 2010-06-01 9:56 ` NeilBrown
2010-06-01 9:56 ` [PATCH 17/24] md/bitmap: white space clean up and similar NeilBrown
` (4 subsequent siblings)
23 siblings, 0 replies; 27+ messages in thread
From: NeilBrown @ 2010-06-01 9:56 UTC (permalink / raw)
To: Heinz Mauelshagen, Alasdair G Kergon; +Cc: linux-raid, dm-devel
Rather than faking a 'core' dirty log, we can now use the dm_dirty_log
mechanism directly and thus potentially benefit from a permanent dirty
log.
Signed-off-by: NeilBrown <neilb@suse.de>
---
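The constructor now accepts any log type in the form "type arg-count args" and hands the arguments on to dm_dirty_log_create(). As a rough illustration only, the split between the log description and the rest of the table line can be sketched in userspace C as below. The struct log_args container and split_log_args() helper are hypothetical names made up for this sketch; strtoul() stands in for the kernel's strict_strtoul().

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical container for the result; not a structure from the patch. */
struct log_args {
	const char *type;	/* log type name, e.g. "core" */
	unsigned long cnt;	/* number of log parameters */
	char **params;		/* the cnt log parameters */
	unsigned int rest_cnt;	/* arguments left for the raid description */
	char **rest;		/* first argument after the log description */
};

/* Returns 0 on success, -1 if the log description cannot be parsed. */
static int split_log_args(unsigned int argc, char **argv, struct log_args *out)
{
	unsigned long cnt;
	char *end;

	if (argc < 2)
		return -1;

	errno = 0;
	cnt = strtoul(argv[1], &end, 10);
	if (errno || end == argv[1] || *end != '\0')
		return -1;

	/* Same intent as "cnt + 2 > argc", written so it cannot overflow. */
	if (cnt > argc - 2)
		return -1;

	out->type = argv[0];
	out->cnt = cnt;
	out->params = argv + 2;
	out->rest_cnt = argc - 2 - (unsigned int)cnt;
	out->rest = argv + 2 + cnt;
	return 0;
}
```

For a table line beginning "core 2 1024 nosync", this would yield type "core", two parameters starting at "1024", and leave everything after "nosync" for the raid description.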
drivers/md/dm-raid456.c | 46 +++++++++++++++++++++++++++++++++++++---------
drivers/md/md.h | 2 +-
2 files changed, 38 insertions(+), 10 deletions(-)
diff --git a/drivers/md/dm-raid456.c b/drivers/md/dm-raid456.c
index 3fda954..3dcbc4a 100644
--- a/drivers/md/dm-raid456.c
+++ b/drivers/md/dm-raid456.c
@@ -7,6 +7,8 @@
#include "md.h"
#include "raid5.h"
#include "dm.h"
+#include "bitmap.h"
+#include <linux/dm-dirty-log.h>
struct raid_dev {
struct dm_dev *dev;
@@ -183,7 +185,8 @@ static int raid_ctr(struct dm_target *ti, unsigned argc, char **argv)
{
char *err = NULL;
int errnum = -EINVAL;
- unsigned long cnt;
+ unsigned long cnt, log_cnt;
+ char **log_argv;
struct raid_type *rt;
unsigned long chunk_size;
int recovery = 1;
@@ -192,16 +195,18 @@ static int raid_ctr(struct dm_target *ti, unsigned argc, char **argv)
sector_t sectors_per_dev, chunks;
struct raid_set *rs = NULL;
int in_sync, i;
+ struct dm_dirty_log *log = NULL;
- /* log type - core XXX [no]sync */
+ /* log type - type arg-count args */
err = "Cannot parse log type";
if (argc < 2 ||
- strcmp(argv[0], "core") != 0 ||
strict_strtoul(argv[1], 10, &cnt) < 0 ||
cnt + 2 > argc)
goto err;
- if (cnt >= 2 && strcmp(argv[3], "sync") == 0)
- recovery = 0;
+
+ log_cnt = cnt;
+ log_argv = argv;
+
argc -= cnt+2;
argv += cnt+2;
@@ -276,6 +281,11 @@ static int raid_ctr(struct dm_target *ti, unsigned argc, char **argv)
if (sector_div(chunks, chunk_size))
goto err;
+ log = dm_dirty_log_create(log_argv[0], ti, sectors_per_dev,
+ NULL, log_cnt, log_argv+2);
+ err = "Error creating dirty log";
+ if (!log)
+ goto err;
/* Now the devices: three words each */
rs = context_alloc(rt, chunk_size, recovery,
@@ -318,6 +328,11 @@ static int raid_ctr(struct dm_target *ti, unsigned argc, char **argv)
ti->split_io = rs->md.chunk_sectors;
ti->private = rs;
+ rs->md.bitmap_info.log = log;
+ rs->md.bitmap_info.daemon_sleep = 10 * HZ;
+ rs->md.bitmap_info.chunksize = log->type->get_region_size(log) * 512;
+ rs->md.bitmap_info.external = 1;
+
mutex_lock(&rs->md.reconfig_mutex);
err = "Fail to run raid array";
errnum = md_run(&rs->md);
@@ -332,6 +347,8 @@ static int raid_ctr(struct dm_target *ti, unsigned argc, char **argv)
return 0;
err:
+ if (log)
+ dm_dirty_log_destroy(log);
if (rs)
context_free(rs);
ti->error = err;
@@ -343,6 +360,7 @@ static void raid_dtr(struct dm_target *ti)
struct raid_set *rs = ti->private;
list_del_init(&rs->callbacks.list);
+ dm_dirty_log_destroy(rs->md.bitmap_info.log);
md_stop(&rs->md);
context_free(rs);
}
@@ -362,6 +380,7 @@ static int raid_status(struct dm_target *ti, status_type_t type,
{
struct raid_set *rs = ti->private;
struct raid5_private_data *conf = rs->md.private;
+ struct dm_dirty_log *log = conf->mddev->bitmap_info.log;
int sz = 0;
int rbcnt;
int i;
@@ -394,14 +413,14 @@ static int raid_status(struct dm_target *ti, status_type_t type,
DMEMIT("%llu/%llu ",
(unsigned long long) sync,
(unsigned long long) rs->md.resync_max_sectors);
- DMEMIT("1 core");
+
+ sz += log->type->status(log, type, result + sz, maxlen - sz);
break;
case STATUSTYPE_TABLE:
/* The string you would use to construct this array */
- /* Pretend to use a core log with a region size of 1 sector */
- DMEMIT("core 2 %u %ssync ", 1,
- rs->md.recovery_cp == MaxSector ? "" : "no");
+ sz += log->type->status(log, type, result + sz, maxlen - sz);
+
DMEMIT("%s ", rs->raid_type->name);
DMEMIT("1 %u ", rs->md.chunk_sectors);
@@ -469,19 +488,28 @@ static void raid_io_hints(struct dm_target *ti,
static void raid_presuspend(struct dm_target *ti)
{
struct raid_set *rs = ti->private;
+ struct dm_dirty_log *log = rs->md.bitmap_info.log;
+
md_stop_writes(&rs->md);
+ log->type->presuspend(log);
}
static void raid_postsuspend(struct dm_target *ti)
{
struct raid_set *rs = ti->private;
+ struct dm_dirty_log *log = rs->md.bitmap_info.log;
+
mddev_suspend(&rs->md);
+ log->type->postsuspend(log);
}
static void raid_resume(struct dm_target *ti)
{
struct raid_set *rs = ti->private;
+ struct dm_dirty_log *log = rs->md.bitmap_info.log;
+ log->type->resume(log);
+ bitmap_load(&rs->md);
mddev_resume(&rs->md);
}
diff --git a/drivers/md/md.h b/drivers/md/md.h
index e97466f..e53b355 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -321,7 +321,7 @@ struct mddev_s
struct mutex mutex;
unsigned long chunksize;
- unsigned long daemon_sleep; /* how many seconds between updates? */
+ unsigned long daemon_sleep; /* how many jiffies between updates? */
unsigned long max_write_behind; /* write-behind mode */
int external;
} bitmap_info;
* Re: [dm-devel] [PATCH 21/24] dm-dirty-log: allow log size to be different from target size.
[not found] ` <1275490641.30896.40.camel@o>
@ 2010-06-03 0:10 ` Neil Brown
2010-06-03 0:53 ` Heinz Mauelshagen
0 siblings, 1 reply; 27+ messages in thread
From: Neil Brown @ 2010-06-03 0:10 UTC (permalink / raw)
To: heinzm, device-mapper development; +Cc: linux-raid
[Insert obligatory grumble about Reply-to headers inserted by
mail-list software. Adding linux-raid back to the cc list...]
On Wed, 02 Jun 2010 16:57:21 +0200
Heinz Mauelshagen <heinzm@redhat.com> wrote:
> Neil,
>
> I had a first review through your patch series, which looks mostly good
> to me.
>
> I've got 2 points so far before I go for tests and further review:
>
> o there's actually no need to change the dm-dirty-log interface in an
> incompatible way to allow for what's needed (see patch attached on top
> of your series) which we can't do in RHEL/SUSE/... anyway.
Yes I know. I saw that work-around in your original patch.
However this patch set isn't aimed at RHEL or SLES, it is aimed upstream.
And when we do things upstream we do them *right*.
If we then want to backport them to RHEL/SLES which require ABI
compatibility, then little hacks like the one you show (i.e. lying about the
size of the target when creating the log) may be perfectly appropriate.
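To make the size difference concrete, here is a small arithmetic sketch. It assumes a RAID5 layout in which one device's worth of each stripe holds parity, so the target length is (raid_devs - 1) times the per-device size. The helper names are made up for illustration; only the round-up division mirrors what dm_sector_div_up() is used for in create_log_context().

```c
#include <stdint.h>

/* Round-up division, as used to turn a sector count into a region count. */
static uint64_t div_up(uint64_t n, uint64_t d)
{
	return (n + d - 1) / d;
}

/* Number of dirty-log regions needed to cover 'sectors' sectors. */
static uint64_t region_count(uint64_t sectors, uint32_t region_size)
{
	return div_up(sectors, region_size);
}

/* Assumed RAID5 geometry: usable length is (raid_devs - 1) components. */
static uint64_t raid5_target_len(uint64_t sectors_per_dev,
				 unsigned int raid_devs)
{
	return sectors_per_dev * (raid_devs - 1);
}
```

With four components of 1048576 sectors each and a region size of 1024 sectors, sizing the log by the target length gives 3072 regions, while sizing it by the component length gives 1024. The second figure is the one the raid5 case wants, which is why the log is created with sectors_per_dev rather than ti->len.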
>
> Notwithstanding, we need a discussion on dm-devel to justify if we
> should change the API upstream in order to avoid such a workaround as in
> my attached patch.
Certainly it is appropriate to discuss this API change - that is why I
highlighted it in my introduction to the patch set. Do you have an opinion
on it?
>
> o any reason you limit the dm-dirty-log type to 'core' ?
That is only in the earlier part of the patch set. md/raid5 can work without
a dirty log at all. When it does, its behaviour is completely analogous to
working with a 'core' dirty log. So to get the raid5 part working without
needing to worry about the dirty-log stuff, I told md/raid5 not to use a
dirty log, and pretended it was using a 'core' log.
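For reference, the "pretend" part amounted to emitting a fixed log description in the table status. A userspace approximation of that line is sketched below: snprintf() stands in for DMEMIT(), MAX_SECTOR is a stand-in for the kernel's MaxSector, and emit_fake_core_log() is a name invented for this sketch.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the kernel's MaxSector (all bits set). */
#define MAX_SECTOR (~(uint64_t)0)

/*
 * Format the log part of the table line the way the intermediate
 * patches did: a 'core' log with a region size of 1 sector, reported
 * as "sync" once recovery_cp has reached the end of the array.
 */
static int emit_fake_core_log(char *buf, size_t len, uint64_t recovery_cp)
{
	return snprintf(buf, len, "core 2 %u %ssync ", 1u,
			recovery_cp == MAX_SECTOR ? "" : "no");
}
```

An array that has finished recovery reports "core 2 1 sync ", and one that still needs recovery reports "core 2 1 nosync ".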
Then in later patches after I had enabled md/bitmap to work with dm-log, I
switched dm-raid45 over to use a real dm-log and removed the restriction to
only using a core log.
So this is really just an artefact of the order in which I developed the
code. I could go back and re-arrange it so the dm-log integration comes
first, then I wouldn't need this intermediate stage which only supports
'core'.
Thanks,
NeilBrown
>
> Heinz
>
> On Tue, 2010-06-01 at 19:56 +1000, NeilBrown wrote:
> > With RAID1, the dirty log covers the size of the target - the number
> > of bits is the target size divided by the region size.
> >
> > For RAID5 and similar the dirty log needs to cover the parity blocks,
> > so it is best to base the dirty log size on the size of the component
> > devices rather than the size of the array.
> >
> > So when creating a dirty log, allow the log_size to be specified
> > separately from the target, and in raid1, set it to the target length.
> >
> > Signed-off-by: NeilBrown <neilb@suse.de>
> > ---
> > drivers/md/dm-log-userspace-base.c | 11 +++++++----
> > drivers/md/dm-log.c | 18 +++++++++++-------
> > drivers/md/dm-raid1.c | 4 ++--
> > include/linux/dm-dirty-log.h | 3 ++-
> > 4 files changed, 22 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/md/dm-log-userspace-base.c b/drivers/md/dm-log-userspace-base.c
> > index 1ed0094..935a49b 100644
> > --- a/drivers/md/dm-log-userspace-base.c
> > +++ b/drivers/md/dm-log-userspace-base.c
> > @@ -94,7 +94,7 @@ retry:
> <SNIP>
>
>
> drivers/md/dm-log.c | 18 +++++++-----------
> drivers/md/dm-raid1.c | 4 ++--
> drivers/md/dm-raid456.c | 8 +++++---
> include/linux/dm-dirty-log.h | 3 +--
> 4 files changed, 15 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/md/dm-log.c b/drivers/md/dm-log.c
> index a232c14..5a08be0 100644
> --- a/drivers/md/dm-log.c
> +++ b/drivers/md/dm-log.c
> @@ -146,7 +146,6 @@ EXPORT_SYMBOL(dm_dirty_log_type_unregister);
>
> struct dm_dirty_log *dm_dirty_log_create(const char *type_name,
> struct dm_target *ti,
> - sector_t log_size,
> int (*flush_callback_fn)(struct dm_target *ti),
> unsigned int argc, char **argv)
> {
> @@ -165,7 +164,7 @@ struct dm_dirty_log *dm_dirty_log_create(const char *type_name,
>
> log->flush_callback_fn = flush_callback_fn;
> log->type = type;
> - if (type->ctr(log, ti, log_size, argc, argv)) {
> + if (type->ctr(log, ti, argc, argv)) {
> kfree(log);
> put_type(type);
> return NULL;
> @@ -336,9 +335,9 @@ static int read_header(struct log_c *log)
> return 0;
> }
>
> -static int _check_region_size(sector_t log_size, uint32_t region_size)
> +static int _check_region_size(struct dm_target *ti, uint32_t region_size)
> {
> - if (region_size < 2 || region_size > log_size)
> + if (region_size < 2 || region_size > ti->len)
> return 0;
>
> if (!is_power_of_2(region_size))
> @@ -354,7 +353,6 @@ static int _check_region_size(sector_t log_size, uint32_t region_size)
> *--------------------------------------------------------------*/
> #define BYTE_SHIFT 3
> static int create_log_context(struct dm_dirty_log *log, struct dm_target *ti,
> - sector_t log_size,
> unsigned int argc, char **argv,
> struct dm_dev *dev)
> {
> @@ -384,12 +382,12 @@ static int create_log_context(struct dm_dirty_log *log, struct dm_target *ti,
> }
>
> if (sscanf(argv[0], "%u", &region_size) != 1 ||
> - !_check_region_size(log_size, region_size)) {
> + !_check_region_size(ti, region_size)) {
> DMWARN("invalid region size %s", argv[0]);
> return -EINVAL;
> }
>
> - region_count = dm_sector_div_up(log_size, region_size);
> + region_count = dm_sector_div_up(ti->len, region_size);
>
> lc = kmalloc(sizeof(*lc), GFP_KERNEL);
> if (!lc) {
> @@ -509,10 +507,9 @@ static int create_log_context(struct dm_dirty_log *log, struct dm_target *ti,
> }
>
> static int core_ctr(struct dm_dirty_log *log, struct dm_target *ti,
> - sector_t log_size,
> unsigned int argc, char **argv)
> {
> - return create_log_context(log, ti, log_size, argc, argv, NULL);
> + return create_log_context(log, ti, argc, argv, NULL);
> }
>
> static void destroy_log_context(struct log_c *lc)
> @@ -536,7 +533,6 @@ static void core_dtr(struct dm_dirty_log *log)
> * argv contains log_device region_size followed optionally by [no]sync
> *--------------------------------------------------------------*/
> static int disk_ctr(struct dm_dirty_log *log, struct dm_target *ti,
> - sector_t log_size,
> unsigned int argc, char **argv)
> {
> int r;
> @@ -551,7 +547,7 @@ static int disk_ctr(struct dm_dirty_log *log, struct dm_target *ti,
> if (r)
> return r;
>
> - r = create_log_context(log, ti, log_size, argc - 1, argv + 1, dev);
> + r = create_log_context(log, ti, argc - 1, argv + 1, dev);
> if (r) {
> dm_put_device(ti, dev);
> return r;
> diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
> index ea732fc..ddda531 100644
> --- a/drivers/md/dm-raid1.c
> +++ b/drivers/md/dm-raid1.c
> @@ -968,8 +968,8 @@ static struct dm_dirty_log *create_dirty_log(struct dm_target *ti,
> return NULL;
> }
>
> - dl = dm_dirty_log_create(argv[0], ti, ti->len, mirror_flush,
> - param_count, argv + 2);
> + dl = dm_dirty_log_create(argv[0], ti, mirror_flush, param_count,
> + argv + 2);
> if (!dl) {
> ti->error = "Error creating mirror dirty log";
> return NULL;
> diff --git a/drivers/md/dm-raid456.c b/drivers/md/dm-raid456.c
> index 3dcbc4a..33d2be8 100644
> --- a/drivers/md/dm-raid456.c
> +++ b/drivers/md/dm-raid456.c
> @@ -192,7 +192,7 @@ static int raid_ctr(struct dm_target *ti, unsigned argc, char **argv)
> int recovery = 1;
> long raid_devs;
> long rebuildA, rebuildB;
> - sector_t sectors_per_dev, chunks;
> + sector_t sectors_per_dev, chunks, ti_len_sav;
> struct raid_set *rs = NULL;
> int in_sync, i;
> struct dm_dirty_log *log = NULL;
> @@ -281,8 +281,10 @@ static int raid_ctr(struct dm_target *ti, unsigned argc, char **argv)
> if (sector_div(chunks, chunk_size))
> goto err;
>
> - log = dm_dirty_log_create(log_argv[0], ti, sectors_per_dev,
> - NULL, log_cnt, log_argv+2);
> + ti_len_sav = ti->len;
> + ti->len = sectors_per_dev;
> + log = dm_dirty_log_create(log_argv[0], ti, NULL, log_cnt, log_argv+2);
> + ti->len = ti_len_sav;
> err = "Error creating dirty log";
> if (!log)
> goto err;
> diff --git a/include/linux/dm-dirty-log.h b/include/linux/dm-dirty-log.h
> index 641419f..7084503 100644
> --- a/include/linux/dm-dirty-log.h
> +++ b/include/linux/dm-dirty-log.h
> @@ -33,7 +33,6 @@ struct dm_dirty_log_type {
> struct list_head list;
>
> int (*ctr)(struct dm_dirty_log *log, struct dm_target *ti,
> - sector_t log_size,
> unsigned argc, char **argv);
> void (*dtr)(struct dm_dirty_log *log);
>
> @@ -138,7 +137,7 @@ int dm_dirty_log_type_unregister(struct dm_dirty_log_type *type);
> * type->constructor/destructor() directly.
> */
> struct dm_dirty_log *dm_dirty_log_create(const char *type_name,
> - struct dm_target *ti, sector_t log_size,
> + struct dm_target *ti,
> int (*flush_callback_fn)(struct dm_target *ti),
> unsigned argc, char **argv);
> void dm_dirty_log_destroy(struct dm_dirty_log *log);
>
>
* Re: [dm-devel] [PATCH 21/24] dm-dirty-log: allow log size to be different from target size.
2010-06-03 0:10 ` [dm-devel] " Neil Brown
@ 2010-06-03 0:53 ` Heinz Mauelshagen
0 siblings, 0 replies; 27+ messages in thread
From: Heinz Mauelshagen @ 2010-06-03 0:53 UTC (permalink / raw)
To: Neil Brown; +Cc: device-mapper development, linux-raid
On Thu, 2010-06-03 at 10:10 +1000, Neil Brown wrote:
> [Insert obligatory grumble about Reply-to headers inserted by
> mail-list software. Adding linux-raid back to the cc list...]
>
> On Wed, 02 Jun 2010 16:57:21 +0200
> Heinz Mauelshagen <heinzm@redhat.com> wrote:
>
> > Neil,
> >
> > I had a first review through your patch series, which looks mostly good
> > to me.
> >
> > I've got 2 points so far before I go for tests and further review:
> >
> > o there's actually no need to change the dm-dirty-log interface in an
> > incompatible way to allow for what's needed (see patch attached on top
> > of your series) which we can't do in RHEL/SUSE/... anyway.
>
> Yes I know. I saw that work-around in your original patch.
> However this patch set isn't aimed at RHEL or SLES, it is aimed upstream.
> And when we do things upstream we do them *right*.
> If we then want to backport them to RHEL/SLES which require ABI
> compatibility, then little hacks like the one you show (i.e. lying about the
> size of the target when creating the log) may be perfectly appropriate.
Ok, we're in agreement then WRT ABI/API compatibility requirements in
distribution main releases as opposed to flexibility in upstream.
>
> >
> > Notwithstanding, we need a discussion on dm-devel to justify if we
> > should change the API upstream in order to avoid such a workaround as in
> > my attached patch.
>
> Certainly it is appropriate to discuss this API change - that is why I
> highlighted it in my introduction to the patch set. Do you have an opinion
> on it?
>
I'd like to avoid it altogether in order to avoid multiple code bases
wherever possible.
>
> >
> > o any reason you limit the dm-dirty-log type to 'core' ?
>
> That is only in the earlier part of the patch set. md/raid5 can work without
> a dirty log at all. When it does, its behaviour is completely analogous to
> working with a 'core' dirty log. So to get the raid5 part working without
> needing to worry about the dirty-log stuff, I told md/raid5 not to use a
> dirty log, and pretended it was using a 'core' log.
Yes, I have since seen it; I overlooked it initially.
>
> Then in later patches after I had enabled md/bitmap to work with dm-log, I
> switched dm-raid45 over to use a real dm-log and removed the restriction to
> only using a core log.
> So this is really just an artefact of the order in which I developed the
> code. I could go back and re-arrange it so the dm-log integration comes
> first, then I wouldn't need this intermediate stage which only supports
> 'core'.
No need to; nicely illustrates the development history.
Heinz
>
> Thanks,
> NeilBrown
>
>
> >
> > Heinz
> >
> > On Tue, 2010-06-01 at 19:56 +1000, NeilBrown wrote:
> > > With RAID1, the dirty log covers the size of the target - the number
> > > of bits is the target size divided by the region size.
> > >
> > > For RAID5 and similar the dirty log needs to cover the parity blocks,
> > > so it is best to base the dirty log size on the size of the component
> > > devices rather than the size of the array.
> > >
> > > So when creating a dirty log, allow the log_size to be specified
> > > separately from the target, and in raid1, set it to the target length.
> > >
> > > Signed-off-by: NeilBrown <neilb@suse.de>
> > > ---
> > > drivers/md/dm-log-userspace-base.c | 11 +++++++----
> > > drivers/md/dm-log.c | 18 +++++++++++-------
> > > drivers/md/dm-raid1.c | 4 ++--
> > > include/linux/dm-dirty-log.h | 3 ++-
> > > 4 files changed, 22 insertions(+), 14 deletions(-)
> > >
> > > diff --git a/drivers/md/dm-log-userspace-base.c b/drivers/md/dm-log-userspace-base.c
> > > index 1ed0094..935a49b 100644
> > > --- a/drivers/md/dm-log-userspace-base.c
> > > +++ b/drivers/md/dm-log-userspace-base.c
> > > @@ -94,7 +94,7 @@ retry:
> > <SNIP>
> >
> >
> > drivers/md/dm-log.c | 18 +++++++-----------
> > drivers/md/dm-raid1.c | 4 ++--
> > drivers/md/dm-raid456.c | 8 +++++---
> > include/linux/dm-dirty-log.h | 3 +--
> > 4 files changed, 15 insertions(+), 18 deletions(-)
> >
> > diff --git a/drivers/md/dm-log.c b/drivers/md/dm-log.c
> > index a232c14..5a08be0 100644
> > --- a/drivers/md/dm-log.c
> > +++ b/drivers/md/dm-log.c
> > @@ -146,7 +146,6 @@ EXPORT_SYMBOL(dm_dirty_log_type_unregister);
> >
> > struct dm_dirty_log *dm_dirty_log_create(const char *type_name,
> > struct dm_target *ti,
> > - sector_t log_size,
> > int (*flush_callback_fn)(struct dm_target *ti),
> > unsigned int argc, char **argv)
> > {
> > @@ -165,7 +164,7 @@ struct dm_dirty_log *dm_dirty_log_create(const char *type_name,
> >
> > log->flush_callback_fn = flush_callback_fn;
> > log->type = type;
> > - if (type->ctr(log, ti, log_size, argc, argv)) {
> > + if (type->ctr(log, ti, argc, argv)) {
> > kfree(log);
> > put_type(type);
> > return NULL;
> > @@ -336,9 +335,9 @@ static int read_header(struct log_c *log)
> > return 0;
> > }
> >
> > -static int _check_region_size(sector_t log_size, uint32_t region_size)
> > +static int _check_region_size(struct dm_target *ti, uint32_t region_size)
> > {
> > - if (region_size < 2 || region_size > log_size)
> > + if (region_size < 2 || region_size > ti->len)
> > return 0;
> >
> > if (!is_power_of_2(region_size))
> > @@ -354,7 +353,6 @@ static int _check_region_size(sector_t log_size, uint32_t region_size)
> > *--------------------------------------------------------------*/
> > #define BYTE_SHIFT 3
> > static int create_log_context(struct dm_dirty_log *log, struct dm_target *ti,
> > - sector_t log_size,
> > unsigned int argc, char **argv,
> > struct dm_dev *dev)
> > {
> > @@ -384,12 +382,12 @@ static int create_log_context(struct dm_dirty_log *log, struct dm_target *ti,
> > }
> >
> > if (sscanf(argv[0], "%u", &region_size) != 1 ||
> > - !_check_region_size(log_size, region_size)) {
> > + !_check_region_size(ti, region_size)) {
> > DMWARN("invalid region size %s", argv[0]);
> > return -EINVAL;
> > }
> >
> > - region_count = dm_sector_div_up(log_size, region_size);
> > + region_count = dm_sector_div_up(ti->len, region_size);
> >
> > lc = kmalloc(sizeof(*lc), GFP_KERNEL);
> > if (!lc) {
> > @@ -509,10 +507,9 @@ static int create_log_context(struct dm_dirty_log *log, struct dm_target *ti,
> > }
> >
> > static int core_ctr(struct dm_dirty_log *log, struct dm_target *ti,
> > - sector_t log_size,
> > unsigned int argc, char **argv)
> > {
> > - return create_log_context(log, ti, log_size, argc, argv, NULL);
> > + return create_log_context(log, ti, argc, argv, NULL);
> > }
> >
> > static void destroy_log_context(struct log_c *lc)
> > @@ -536,7 +533,6 @@ static void core_dtr(struct dm_dirty_log *log)
> > * argv contains log_device region_size followed optionally by [no]sync
> > *--------------------------------------------------------------*/
> > static int disk_ctr(struct dm_dirty_log *log, struct dm_target *ti,
> > - sector_t log_size,
> > unsigned int argc, char **argv)
> > {
> > int r;
> > @@ -551,7 +547,7 @@ static int disk_ctr(struct dm_dirty_log *log, struct dm_target *ti,
> > if (r)
> > return r;
> >
> > - r = create_log_context(log, ti, log_size, argc - 1, argv + 1, dev);
> > + r = create_log_context(log, ti, argc - 1, argv + 1, dev);
> > if (r) {
> > dm_put_device(ti, dev);
> > return r;
> > diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
> > index ea732fc..ddda531 100644
> > --- a/drivers/md/dm-raid1.c
> > +++ b/drivers/md/dm-raid1.c
> > @@ -968,8 +968,8 @@ static struct dm_dirty_log *create_dirty_log(struct dm_target *ti,
> > return NULL;
> > }
> >
> > - dl = dm_dirty_log_create(argv[0], ti, ti->len, mirror_flush,
> > - param_count, argv + 2);
> > + dl = dm_dirty_log_create(argv[0], ti, mirror_flush, param_count,
> > + argv + 2);
> > if (!dl) {
> > ti->error = "Error creating mirror dirty log";
> > return NULL;
> > diff --git a/drivers/md/dm-raid456.c b/drivers/md/dm-raid456.c
> > index 3dcbc4a..33d2be8 100644
> > --- a/drivers/md/dm-raid456.c
> > +++ b/drivers/md/dm-raid456.c
> > @@ -192,7 +192,7 @@ static int raid_ctr(struct dm_target *ti, unsigned argc, char **argv)
> > int recovery = 1;
> > long raid_devs;
> > long rebuildA, rebuildB;
> > - sector_t sectors_per_dev, chunks;
> > + sector_t sectors_per_dev, chunks, ti_len_sav;
> > struct raid_set *rs = NULL;
> > int in_sync, i;
> > struct dm_dirty_log *log = NULL;
> > @@ -281,8 +281,10 @@ static int raid_ctr(struct dm_target *ti, unsigned argc, char **argv)
> > if (sector_div(chunks, chunk_size))
> > goto err;
> >
> > - log = dm_dirty_log_create(log_argv[0], ti, sectors_per_dev,
> > - NULL, log_cnt, log_argv+2);
> > + ti_len_sav = ti->len;
> > + ti->len = sectors_per_dev;
> > + log = dm_dirty_log_create(log_argv[0], ti, NULL, log_cnt, log_argv+2);
> > + ti->len = ti_len_sav;
> > err = "Error creating dirty log";
> > if (!log)
> > goto err;
> > diff --git a/include/linux/dm-dirty-log.h b/include/linux/dm-dirty-log.h
> > index 641419f..7084503 100644
> > --- a/include/linux/dm-dirty-log.h
> > +++ b/include/linux/dm-dirty-log.h
> > @@ -33,7 +33,6 @@ struct dm_dirty_log_type {
> > struct list_head list;
> >
> > int (*ctr)(struct dm_dirty_log *log, struct dm_target *ti,
> > - sector_t log_size,
> > unsigned argc, char **argv);
> > void (*dtr)(struct dm_dirty_log *log);
> >
> > @@ -138,7 +137,7 @@ int dm_dirty_log_type_unregister(struct dm_dirty_log_type *type);
> > * type->constructor/destructor() directly.
> > */
> > struct dm_dirty_log *dm_dirty_log_create(const char *type_name,
> > - struct dm_target *ti, sector_t log_size,
> > + struct dm_target *ti,
> > int (*flush_callback_fn)(struct dm_target *ti),
> > unsigned argc, char **argv);
> > void dm_dirty_log_destroy(struct dm_dirty_log *log);
> >
> >
>
end of thread, other threads:[~2010-06-03 0:53 UTC | newest]
Thread overview: 27+ messages
2010-06-01 9:56 [PATCH 00/24] dm-raid456 support using md/raid5.c, now with dirty-log NeilBrown
2010-06-01 9:56 ` [PATCH 01/24] md: reduce dependence on sysfs NeilBrown
2010-06-01 9:56 ` [PATCH 03/24] md/raid5: ensure we create a unique name for kmem_cache when mddev has no gendisk NeilBrown
2010-06-01 9:56 ` [PATCH 02/24] md/raid5: factor out code for changing size of stripe cache NeilBrown
2010-06-01 9:56 ` [PATCH 08/24] dm-raid456: add support for raising events to userspace NeilBrown
2010-06-01 9:56 ` [PATCH 06/24] md: export various start/stop interfaces NeilBrown
2010-06-01 9:56 ` [PATCH 14/24] dm-raid456: add support for setting IO hints NeilBrown
2010-06-01 9:56 ` [PATCH 10/24] dm-raid456: add congestion checking NeilBrown
2010-06-01 9:56 ` [PATCH 04/24] md: be more careful setting MD_CHANGE_CLEAN NeilBrown
2010-06-01 9:56 ` [PATCH 05/24] md: split out md_rdev_init NeilBrown
2010-06-01 9:56 ` [PATCH 12/24] md/plug: optionally use plugger to unplug an array during resync/recovery NeilBrown
2010-06-01 9:56 ` [PATCH 07/24] md/dm: create dm-raid456 module using md/raid5 NeilBrown
2010-06-01 9:56 ` [PATCH 11/24] md/raid5: add simple plugging infrastructure NeilBrown
2010-06-01 9:56 ` [PATCH 13/24] dm-raid456: support unplug NeilBrown
2010-06-01 9:56 ` [PATCH 09/24] raid5: Don't set read-ahead when there is no queue NeilBrown
2010-06-01 9:56 ` [PATCH 19/24] md/bitmap: clean up plugging calls NeilBrown
2010-06-01 9:56 ` [PATCH 20/24] md/bitmap: optimise scanning of empty bitmaps NeilBrown
2010-06-01 9:56 ` [PATCH 22/24] md/bitmap: prepare for storing write-intent-bitmap via dm-dirty-log NeilBrown
2010-06-01 9:56 ` [PATCH 15/24] dm-raid456: add suspend/resume method NeilBrown
2010-06-01 9:56 ` [PATCH 24/24] dm-raid456: switch to use dm_dirty_log for tracking dirty regions NeilBrown
2010-06-01 9:56 ` [PATCH 17/24] md/bitmap: white space clean up and similar NeilBrown
2010-06-01 9:56 ` [PATCH 18/24] md/bitmap: reduce dependence on sysfs NeilBrown
2010-06-01 9:56 ` [PATCH 23/24] md/bitmap: separate out loading a bitmap from initialising the structures NeilBrown
2010-06-01 9:56 ` [PATCH 16/24] dm-raid456: add message handler NeilBrown
2010-06-01 9:56 ` [PATCH 21/24] dm-dirty-log: allow log size to be different from target size NeilBrown
[not found] ` <1275490641.30896.40.camel@o>
2010-06-03 0:10 ` [dm-devel] " Neil Brown
2010-06-03 0:53 ` Heinz Mauelshagen