* DDF / RAID10 patch series for mdadm
@ 2013-03-01 22:28 mwilck
2013-03-01 22:28 ` [PATCH 01/12] DDF: cleanly save the secondary DDF structure mwilck
` (12 more replies)
0 siblings, 13 replies; 16+ messages in thread
From: mwilck @ 2013-03-01 22:28 UTC (permalink / raw)
To: neilb, linux-raid; +Cc: mwilck
Hello Neil, hello everybody,
I am working on improved DDF support in mdadm. I would be grateful
for a review of this patch series. The main new feature is support for
RAID 10 in DDF, such that md is able to interoperate with an LSI
Megaraid Software RAID driver on a RAID10 array.
The DDF/RAID10 support is not feature complete, but usable. Both
assembly and incremental assembly work as expected. I verified that
the 2+2 layout maps cleanly to that used by the LSI SW RAID stack. Meta
data are saved cleanly and interpreted correctly by the LSI stack.
What is still missing is code to handle disk failures and additions.
I would appreciate some guidance in that area, in particular on how to test it.
In the DDF Spec, RAID10 is a special case of the broad "secondary RAID
level" concept. The intersection between DDF RAID 10 and md RAID 10 is
tiny - just the "near" layout with an even number of disks. Fortunately this
is also the only type of "secondary RAID" supported by my LSI stack,
and generally the most common multilevel RAID, AFAICT.
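To illustrate the overlap described above, here is a minimal sketch (not part of the patch series) of how md packs a RAID10 layout into a single integer, with the near-copy count in the low byte and the far-copy count in the next byte. The helper names md_raid10_layout and ddf_raid10_disks are hypothetical and do not exist in mdadm.

```c
#include <assert.h>

/* md packs a RAID10 layout into one integer:
 * bits 0-7 hold the near copies, bits 8-15 hold the far copies. */
static unsigned int md_raid10_layout(unsigned int near_copies,
				     unsigned int far_copies)
{
	assert(near_copies >= 1 && near_copies <= 0xff);
	assert(far_copies >= 1 && far_copies <= 0xff);
	return (far_copies << 8) | near_copies;
}

/* Hypothetical helper: total number of md member disks for a DDF RAID10
 * built from n_bvds mirror sets (BVDs) with disks_per_bvd members each. */
static unsigned int ddf_raid10_disks(unsigned int n_bvds,
				     unsigned int disks_per_bvd)
{
	return n_bvds * disks_per_bvd;
}
```

Under these assumptions, the 2+2 setup mentioned earlier would correspond to md_raid10_layout(2, 1) with ddf_raid10_disks(2, 2) member disks, i.e. a plain "near 2" layout on four disks.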
The DDF 2nd level RAID concept would be matched more closely by a
different approach where md only creates the BVDs and the secondary
RAID is implemented with the device mapper on top of it. But I opted
for the direct mapping of DDF RAID10 to md RAID10 for several reasons:
1) dm wouldn't handle the generic "striped" and "spanned" secondary
levels out of the box, either. 2) I wanted to take full advantage of the
advanced features and performance of md RAID10 (it actually performed
better than the LSI stack in a few simple benchmarks). 3) It is
simpler to handle this way. 4) This is non-exclusive; code could still
be added later to use dm for more "exotic" secondary DDF RAID levels.
The series starts out with 3 patches I already submitted a while
ago. They fix interoperability issues related to meta data handling, in
particular DDF header locations and sequence numbers. They are unrelated
to RAID10.
Patches 4..9 introduce support for DDF RAID 10. The current mdadm
implementation of DDF ignores the secondary level completely and stores
only the BVD information in the super block, one BVD for every DDF
"Virtual disk configuration record". I had to add an additional data
structure to hold information about the "other BVDs" belonging to a
second level RAID set. The most interesting stuff happens in
container_content_ddf, which translates the DDF RAID 10 setup into md
structures.
Patch 10 adds some missing sanity checks in compare_super_ddf. Patch 11
extends compare_super_ddf() such that information from the new disk that
is missing in the current superblock is added. This fixes a problem where
mdmon would save updated meta data only on those disks that were present
when mdadm was started, not on disks added later.
Finally, patch 12 is a small improvement to Detail() that results in
better output of mdadm -D for complex DDF setups.
Regards
Martin
Martin Wilck (12):
DDF: cleanly save the secondary DDF structure
DDF: use existing locations for primary and secondary DDF structure
DDF: increase seq number when writing meta data
DDF: added other_bvd to struct vcl
DDF: load_ddf_local: store VD conf for other BVDs
DDF: container_content_ddf: change array disk search loop
DDF: container_content_ddf: check for secondary RAID
DDF: container_content_ddf: handle RAID layout for RAID10
DDF: __write_init_super_ddf: use correct VD conf
DDF: add sanity checks in compare_super_ddf
DDF: compare_super_ddf: merge local info of other superblock
Detail.c: call load_container for container subarrays
Detail.c | 10 +-
super-ddf.c | 556 ++++++++++++++++++++++++++++++++++++++++++++++++++---------
2 files changed, 484 insertions(+), 82 deletions(-)
--
1.7.3.4
* [PATCH 01/12] DDF: cleanly save the secondary DDF structure
From: mwilck @ 2013-03-01 22:28 UTC (permalink / raw)
To: neilb, linux-raid; +Cc: mwilck
So far, mdadm only saved the header of the secondary structure.
With this patch, the full secondary DDF structure is saved
consistently, too. Some vendor DDF implementations need it.
Signed-off-by: Martin Wilck <mwilck@arcor.de>
---
super-ddf.c | 139 ++++++++++++++++++++++++++++++++++-------------------------
1 files changed, 80 insertions(+), 59 deletions(-)
diff --git a/super-ddf.c b/super-ddf.c
index 3b3c1f0..7f943cb 100644
--- a/super-ddf.c
+++ b/super-ddf.c
@@ -2317,17 +2317,87 @@ static int remove_from_super_ddf(struct supertype *st, mdu_disk_info_t *dk)
*/
#define NULL_CONF_SZ 4096
-static int __write_init_super_ddf(struct supertype *st)
+static int __write_ddf_structure(struct dl *d, struct ddf_super *ddf, __u8 type,
+ char *null_aligned)
{
+ unsigned long long sector;
+ struct ddf_header *header;
+ int fd, i, n_config, conf_size;
+
+ fd = d->fd;
+
+ switch (type) {
+ case DDF_HEADER_PRIMARY:
+ header = &ddf->primary;
+ sector = __be64_to_cpu(header->primary_lba);
+ break;
+ case DDF_HEADER_SECONDARY:
+ header = &ddf->secondary;
+ sector = __be64_to_cpu(header->secondary_lba);
+ break;
+ default:
+ return 0;
+ }
+
+ header->type = type;
+ header->openflag = 0;
+ header->crc = calc_crc(header, 512);
+
+ lseek64(fd, sector<<9, 0);
+ if (write(fd, header, 512) < 0)
+ return 0;
+
+ ddf->controller.crc = calc_crc(&ddf->controller, 512);
+ if (write(fd, &ddf->controller, 512) < 0)
+ return 0;
+ ddf->phys->crc = calc_crc(ddf->phys, ddf->pdsize);
+ if (write(fd, ddf->phys, ddf->pdsize) < 0)
+ return 0;
+ ddf->virt->crc = calc_crc(ddf->virt, ddf->vdsize);
+ if (write(fd, ddf->virt, ddf->vdsize) < 0)
+ return 0;
+
+ /* Now write lots of config records. */
+ n_config = ddf->max_part;
+ conf_size = ddf->conf_rec_len * 512;
+ for (i = 0 ; i <= n_config ; i++) {
+ struct vcl *c = d->vlist[i];
+ if (i == n_config)
+ c = (struct vcl *)d->spare;
+
+ if (c) {
+ c->conf.crc = calc_crc(&c->conf, conf_size);
+ if (write(fd, &c->conf, conf_size) < 0)
+ break;
+ } else {
+ unsigned int togo = conf_size;
+ while (togo > NULL_CONF_SZ) {
+ if (write(fd, null_aligned, NULL_CONF_SZ) < 0)
+ break;
+ togo -= NULL_CONF_SZ;
+ }
+ if (write(fd, null_aligned, togo) < 0)
+ break;
+ }
+ }
+ if (i <= n_config)
+ return 0;
+
+ d->disk.crc = calc_crc(&d->disk, 512);
+ if (write(fd, &d->disk, 512) < 0)
+ return 0;
+
+ return 1;
+}
+
+static int __write_init_super_ddf(struct supertype *st)
+{
struct ddf_super *ddf = st->sb;
- int i;
struct dl *d;
- int n_config;
- int conf_size;
int attempts = 0;
int successes = 0;
- unsigned long long size, sector;
+ unsigned long long size;
char *null_aligned;
if (posix_memalign((void**)&null_aligned, 4096, NULL_CONF_SZ) != 0) {
@@ -2355,6 +2425,7 @@ static int __write_init_super_ddf(struct supertype *st)
size /= 512;
ddf->anchor.workspace_lba = __cpu_to_be64(size - 32*1024*2);
ddf->anchor.primary_lba = __cpu_to_be64(size - 16*1024*2);
+ ddf->anchor.secondary_lba = __cpu_to_be64(size - 31*1024*2);
ddf->anchor.seq = __cpu_to_be32(1);
memcpy(&ddf->primary, &ddf->anchor, 512);
memcpy(&ddf->secondary, &ddf->anchor, 512);
@@ -2363,64 +2434,14 @@ static int __write_init_super_ddf(struct supertype *st)
ddf->anchor.seq = 0xFFFFFFFF; /* no sequencing in anchor */
ddf->anchor.crc = calc_crc(&ddf->anchor, 512);
- ddf->primary.openflag = 0;
- ddf->primary.type = DDF_HEADER_PRIMARY;
-
- ddf->secondary.openflag = 0;
- ddf->secondary.type = DDF_HEADER_SECONDARY;
-
- ddf->primary.crc = calc_crc(&ddf->primary, 512);
- ddf->secondary.crc = calc_crc(&ddf->secondary, 512);
-
- sector = size - 16*1024*2;
- lseek64(fd, sector<<9, 0);
- if (write(fd, &ddf->primary, 512) < 0)
- continue;
-
- ddf->controller.crc = calc_crc(&ddf->controller, 512);
- if (write(fd, &ddf->controller, 512) < 0)
+ if (!__write_ddf_structure(d, ddf, DDF_HEADER_PRIMARY,
+ null_aligned))
continue;
- ddf->phys->crc = calc_crc(ddf->phys, ddf->pdsize);
-
- if (write(fd, ddf->phys, ddf->pdsize) < 0)
- continue;
-
- ddf->virt->crc = calc_crc(ddf->virt, ddf->vdsize);
- if (write(fd, ddf->virt, ddf->vdsize) < 0)
+ if (!__write_ddf_structure(d, ddf, DDF_HEADER_SECONDARY,
+ null_aligned))
continue;
- /* Now write lots of config records. */
- n_config = ddf->max_part;
- conf_size = ddf->conf_rec_len * 512;
- for (i = 0 ; i <= n_config ; i++) {
- struct vcl *c = d->vlist[i];
- if (i == n_config)
- c = (struct vcl*)d->spare;
-
- if (c) {
- c->conf.crc = calc_crc(&c->conf, conf_size);
- if (write(fd, &c->conf, conf_size) < 0)
- break;
- } else {
- unsigned int togo = conf_size;
- while (togo > NULL_CONF_SZ) {
- if (write(fd, null_aligned, NULL_CONF_SZ) < 0)
- break;
- togo -= NULL_CONF_SZ;
- }
- if (write(fd, null_aligned, togo) < 0)
- break;
- }
- }
- if (i <= n_config)
- continue;
- d->disk.crc = calc_crc(&d->disk, 512);
- if (write(fd, &d->disk, 512) < 0)
- continue;
-
- /* Maybe do the same for secondary */
-
lseek64(fd, (size-1)*512, SEEK_SET);
if (write(fd, &ddf->anchor, 512) < 0)
continue;
--
1.7.3.4
* [PATCH 02/12] DDF: use existing locations for primary and secondary DDF structure
From: mwilck @ 2013-03-01 22:28 UTC (permalink / raw)
To: neilb, linux-raid; +Cc: mwilck
Some RAID BIOSes apparently use hard-coded LBA offsets (presumably
from the end of the disk) for the primary and secondary DDF
structure, ignoring the values given in the DDF anchor. This is
broken BIOS behavior, but it will cause any changes made by MD
(e.g. setting the init_state flag after a full initialization)
to be "forgotten" after the next reboot.
This patch fixes this by using the existing LBA locations if
available. I verified that this fixes the problem with MD and the
LSI Mega Software RAID BIOS.
Signed-off-by: Martin Wilck <mwilck@arcor.de>
---
super-ddf.c | 28 +++++++++++++++++++++++++---
1 files changed, 25 insertions(+), 3 deletions(-)
diff --git a/super-ddf.c b/super-ddf.c
index 7f943cb..2f75fc3 100644
--- a/super-ddf.c
+++ b/super-ddf.c
@@ -421,6 +421,9 @@ struct ddf_super {
char *devname;
int fd;
unsigned long long size; /* sectors */
+ unsigned long long primary_lba; /* sectors */
+ unsigned long long secondary_lba; /* sectors */
+ unsigned long long workspace_lba; /* sectors */
int pdnum; /* index in ->phys */
struct spare_assign *spare;
void *mdupdate; /* hold metadata update */
@@ -668,6 +671,13 @@ static int load_ddf_local(int fd, struct ddf_super *super,
dl->size = 0;
if (get_dev_size(fd, devname, &dsize))
dl->size = dsize >> 9;
+ /* If the disks have different sizes, the LBAs will differ
+ between phys disks.
+ At this point here, the values in super->active must be valid
+ for this phys disk. */
+ dl->primary_lba = super->active->primary_lba;
+ dl->secondary_lba = super->active->secondary_lba;
+ dl->workspace_lba = super->active->workspace_lba;
dl->spare = NULL;
for (i = 0 ; i < super->max_part ; i++)
dl->vlist[i] = NULL;
@@ -2423,9 +2433,21 @@ static int __write_init_super_ddf(struct supertype *st)
*/
get_dev_size(fd, NULL, &size);
size /= 512;
- ddf->anchor.workspace_lba = __cpu_to_be64(size - 32*1024*2);
- ddf->anchor.primary_lba = __cpu_to_be64(size - 16*1024*2);
- ddf->anchor.secondary_lba = __cpu_to_be64(size - 31*1024*2);
+ if (d->workspace_lba != 0)
+ ddf->anchor.workspace_lba = d->workspace_lba;
+ else
+ ddf->anchor.workspace_lba =
+ __cpu_to_be64(size - 32*1024*2);
+ if (d->primary_lba != 0)
+ ddf->anchor.primary_lba = d->primary_lba;
+ else
+ ddf->anchor.primary_lba =
+ __cpu_to_be64(size - 16*1024*2);
+ if (d->secondary_lba != 0)
+ ddf->anchor.secondary_lba = d->secondary_lba;
+ else
+ ddf->anchor.secondary_lba =
+ __cpu_to_be64(size - 32*1024*2);
ddf->anchor.seq = __cpu_to_be32(1);
memcpy(&ddf->primary, &ddf->anchor, 512);
memcpy(&ddf->secondary, &ddf->anchor, 512);
--
1.7.3.4
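The selection rule in the patch above can be sketched in isolation as follows. This is only an illustration under simplifying assumptions: the values are in host byte order (the patch keeps them big-endian), sectors are 512 bytes, and the helper name pick_lba and the macro SECTORS_PER_MIB are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SECTORS_PER_MIB 2048ULL	/* 512-byte sectors per MiB */

/* Hypothetical helper: keep an LBA that was already found on the disk,
 * otherwise fall back to a default placed mib_from_end MiB before the
 * end of the device. A value of 0 means "nothing known yet". */
static uint64_t pick_lba(uint64_t existing_lba, uint64_t disk_sectors,
			 uint64_t mib_from_end)
{
	if (existing_lba != 0)
		return existing_lba;
	assert(disk_sectors >= mib_from_end * SECTORS_PER_MIB);
	return disk_sectors - mib_from_end * SECTORS_PER_MIB;
}
```

With these assumptions, the default primary location "size - 16*1024*2" corresponds to pick_lba(0, size, 16), while a disk that already carries a primary header keeps its recorded location.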
* [PATCH 03/12] DDF: increase seq number when writing meta data
From: mwilck @ 2013-03-01 22:28 UTC (permalink / raw)
To: neilb, linux-raid; +Cc: mwilck
Cleanly increase the seq number when the DDF structures are
written, instead of always setting it back to 1.
Also, make sure that the sequence number of all headers and
VD conf records is the same.
Signed-off-by: Martin Wilck <mwilck@arcor.de>
---
super-ddf.c | 11 ++++++++++-
1 files changed, 10 insertions(+), 1 deletions(-)
diff --git a/super-ddf.c b/super-ddf.c
index 2f75fc3..e165927 100644
--- a/super-ddf.c
+++ b/super-ddf.c
@@ -2377,6 +2377,7 @@ static int __write_ddf_structure(struct dl *d, struct ddf_super *ddf, __u8 type,
c = (struct vcl *)d->spare;
if (c) {
+ c->conf.seqnum = ddf->primary.seq;
c->conf.crc = calc_crc(&c->conf, conf_size);
if (write(fd, &c->conf, conf_size) < 0)
break;
@@ -2409,12 +2410,20 @@ static int __write_init_super_ddf(struct supertype *st)
int successes = 0;
unsigned long long size;
char *null_aligned;
+ __u32 seq;
if (posix_memalign((void**)&null_aligned, 4096, NULL_CONF_SZ) != 0) {
return -ENOMEM;
}
memset(null_aligned, 0xff, NULL_CONF_SZ);
+ if (ddf->primary.seq != 0xffffffff)
+ seq = __cpu_to_be32(__be32_to_cpu(ddf->primary.seq)+1);
+ else if (ddf->secondary.seq != 0xffffffff)
+ seq = __cpu_to_be32(__be32_to_cpu(ddf->secondary.seq)+1);
+ else
+ seq = __cpu_to_be32(1);
+
/* try to write updated metadata,
* if we catch a failure move on to the next disk
*/
@@ -2448,12 +2457,12 @@ static int __write_init_super_ddf(struct supertype *st)
else
ddf->anchor.secondary_lba =
__cpu_to_be64(size - 32*1024*2);
- ddf->anchor.seq = __cpu_to_be32(1);
memcpy(&ddf->primary, &ddf->anchor, 512);
memcpy(&ddf->secondary, &ddf->anchor, 512);
ddf->anchor.openflag = 0xFF; /* 'open' means nothing */
ddf->anchor.seq = 0xFFFFFFFF; /* no sequencing in anchor */
+ ddf->secondary.seq = ddf->primary.seq = seq;
ddf->anchor.crc = calc_crc(&ddf->anchor, 512);
if (!__write_ddf_structure(d, ddf, DDF_HEADER_PRIMARY,
--
1.7.3.4
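The sequence number update in the patch above can be illustrated with a small stand-alone sketch. It uses the POSIX htonl()/ntohl() functions in place of mdadm's __cpu_to_be32()/__be32_to_cpu() macros; the function name next_seq_be and the macro SEQ_UNSET are hypothetical.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* All-ones reads the same in either byte order, so it can be compared
 * without conversion. */
#define SEQ_UNSET 0xffffffffu

/* Hypothetical helper: return the next sequence number in big-endian
 * form, preferring the primary header, then the secondary header, and
 * starting at 1 if neither carries a usable value. */
static uint32_t next_seq_be(uint32_t primary_seq_be, uint32_t secondary_seq_be)
{
	if (primary_seq_be != SEQ_UNSET)
		return htonl(ntohl(primary_seq_be) + 1);
	if (secondary_seq_be != SEQ_UNSET)
		return htonl(ntohl(secondary_seq_be) + 1);
	return htonl(1);
}
```

The increment has to happen in host byte order; adding 1 directly to the big-endian value would change the most significant byte on little-endian machines.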
* [PATCH 04/12] DDF: added other_bvd to struct vcl
From: mwilck @ 2013-03-01 22:28 UTC (permalink / raw)
To: neilb, linux-raid; +Cc: mwilck
The VD config structures of different BVDs in the same SVD may
differ. The new other_bvds pointer array holds the VD configs of the
other BVDs.
Signed-off-by: Martin Wilck <mwilck@arcor.de>
---
super-ddf.c | 10 ++++++++++
1 files changed, 10 insertions(+), 0 deletions(-)
diff --git a/super-ddf.c b/super-ddf.c
index e165927..8ec0afb 100644
--- a/super-ddf.c
+++ b/super-ddf.c
@@ -407,6 +407,7 @@ struct ddf_super {
__u64 *lba_offset; /* location in 'conf' of
* the lba table */
unsigned int vcnum; /* index into ->virt */
+ struct vd_config **other_bvds;
__u64 *block_sizes; /* NULL if all the same */
};
};
@@ -743,6 +744,12 @@ static int load_ddf_local(int fd, struct ddf_super *super,
}
vcl->next = super->conflist;
vcl->block_sizes = NULL; /* FIXME not for CONCAT */
+ if (vd->sec_elmnt_count > 1)
+ vcl->other_bvds =
+ xcalloc(vd->sec_elmnt_count - 1,
+ sizeof(struct vd_config *));
+ else
+ vcl->other_bvds = NULL;
super->conflist = vcl;
dl->vlist[vnum++] = vcl;
}
@@ -860,6 +867,8 @@ static void free_super_ddf(struct supertype *st)
ddf->conflist = v->next;
if (v->block_sizes)
free(v->block_sizes);
+ if (v->other_bvds)
+ free(v->other_bvds);
free(v);
}
while (ddf->dlist) {
@@ -2028,6 +2037,7 @@ static int init_super_ddf_bvd(struct supertype *st,
vcl->lba_offset = (__u64*) &vcl->conf.phys_refnum[ddf->mppe];
vcl->vcnum = venum;
vcl->block_sizes = NULL; /* FIXME not for CONCAT */
+ vcl->other_bvds = NULL;
vc = &vcl->conf;
--
1.7.3.4
* [PATCH 05/12] DDF: load_ddf_local: store VD conf for other BVDs
From: mwilck @ 2013-03-01 22:28 UTC (permalink / raw)
To: neilb, linux-raid; +Cc: mwilck
Store VD config for other BVDs in the other_bvds array.
This allows handling secondary RAID levels in container_content_ddf.
Signed-off-by: Martin Wilck <mwilck@arcor.de>
---
super-ddf.c | 42 +++++++++++++++++++++++++++++++++++++++++-
1 files changed, 41 insertions(+), 1 deletions(-)
diff --git a/super-ddf.c b/super-ddf.c
index 8ec0afb..5426d29 100644
--- a/super-ddf.c
+++ b/super-ddf.c
@@ -636,6 +636,36 @@ static int load_ddf_global(int fd, struct ddf_super *super, char *devname)
return 0;
}
+static void add_other_bvd(struct vcl *vcl, struct vd_config *vd,
+ unsigned int len)
+{
+ int i;
+ for (i = 0; i < vcl->conf.sec_elmnt_count-1; i++)
+ if (vcl->other_bvds[i] != NULL &&
+ vcl->other_bvds[i]->sec_elmnt_seq == vd->sec_elmnt_seq)
+ break;
+
+ if (i < vcl->conf.sec_elmnt_count-1) {
+ if (vd->seqnum <= vcl->other_bvds[i]->seqnum)
+ return;
+ } else {
+ for (i = 0; i < vcl->conf.sec_elmnt_count-1; i++)
+ if (vcl->other_bvds[i] == NULL)
+ break;
+ if (i == vcl->conf.sec_elmnt_count-1) {
+ pr_err("no space for sec level config %u, count is %u\n",
+ vd->sec_elmnt_seq, vcl->conf.sec_elmnt_count);
+ return;
+ }
+ if (posix_memalign((void **)&vcl->other_bvds[i], 512, len)
+ != 0) {
+ pr_err("%s could not allocate vd buf\n", __func__);
+ return;
+ }
+ }
+ memcpy(vcl->other_bvds[i], vd, len);
+}
+
static int load_ddf_local(int fd, struct ddf_super *super,
char *devname, int keep)
{
@@ -731,6 +761,11 @@ static int load_ddf_local(int fd, struct ddf_super *super,
if (vcl) {
dl->vlist[vnum++] = vcl;
+ if (vcl->other_bvds != NULL &&
+ vcl->conf.sec_elmnt_seq != vd->sec_elmnt_seq) {
+ add_other_bvd(vcl, vd, super->conf_rec_len*512);
+ continue;
+ }
if (__be32_to_cpu(vd->seqnum) <=
__be32_to_cpu(vcl->conf.seqnum))
continue;
@@ -867,8 +902,13 @@ static void free_super_ddf(struct supertype *st)
ddf->conflist = v->next;
if (v->block_sizes)
free(v->block_sizes);
- if (v->other_bvds)
+ if (v->other_bvds) {
+ int i;
+ for (i = 0; i < v->conf.sec_elmnt_count-1; i++)
+ if (v->other_bvds[i] != NULL)
+ free(v->other_bvds[i]);
free(v->other_bvds);
+ }
free(v);
}
while (ddf->dlist) {
--
1.7.3.4
* [PATCH 06/12] DDF: container_content_ddf: change array disk search loop
From: mwilck @ 2013-03-01 22:28 UTC (permalink / raw)
To: neilb, linux-raid; +Cc: mwilck
When searching for container elements, loop over the known phys
disks rather than the elements of the current configuration.
This patch changes nothing in the logic or return value of the code.
It just prepares extended logic for handling RAID10.
Signed-off-by: Martin Wilck <mwilck@arcor.de>
---
super-ddf.c | 33 +++++++++++++++++++++------------
1 files changed, 21 insertions(+), 12 deletions(-)
diff --git a/super-ddf.c b/super-ddf.c
index 5426d29..a5080dd 100644
--- a/super-ddf.c
+++ b/super-ddf.c
@@ -3051,6 +3051,17 @@ static int load_container_ddf(struct supertype *st, int fd,
#endif /* MDASSEMBLE */
+#define NO_SUCH_REFNUM (0xFFFFFFFF)
+static unsigned int get_pd_index_from_refnum(const struct vcl *vc,
+ __u32 refnum, unsigned int nmax)
+{
+ unsigned int i;
+ for (i = 0 ; i < nmax ; i++)
+ if (vc->conf.phys_refnum[i] == refnum)
+ return i;
+ return NO_SUCH_REFNUM;
+}
+
static struct mdinfo *container_content_ddf(struct supertype *st, char *subarray)
{
/* Given a container loaded by load_super_ddf_all,
@@ -3072,6 +3083,7 @@ static struct mdinfo *container_content_ddf(struct supertype *st, char *subarray
struct mdinfo *this;
char *ep;
__u32 *cptr;
+ unsigned int pd;
if (subarray &&
(strtoul(subarray, &ep, 10) != vc->vcnum ||
@@ -3125,21 +3137,12 @@ static struct mdinfo *container_content_ddf(struct supertype *st, char *subarray
devnum2devname(st->container_dev),
this->container_member);
- for (i = 0 ; i < ddf->mppe ; i++) {
+ for (pd = 0; pd < __be16_to_cpu(ddf->phys->used_pdes); pd++) {
struct mdinfo *dev;
struct dl *d;
int stt;
- int pd;
- if (vc->conf.phys_refnum[i] == 0xFFFFFFFF)
- continue;
-
- for (pd = __be16_to_cpu(ddf->phys->used_pdes);
- pd--;)
- if (ddf->phys->entries[pd].refnum
- == vc->conf.phys_refnum[i])
- break;
- if (pd < 0)
+ if (ddf->phys->entries[pd].refnum == 0xFFFFFFFF)
continue;
stt = __be16_to_cpu(ddf->phys->entries[pd].state);
@@ -3147,10 +3150,16 @@ static struct mdinfo *container_content_ddf(struct supertype *st, char *subarray
!= DDF_Online)
continue;
+ i = get_pd_index_from_refnum(
+ vc, ddf->phys->entries[pd].refnum, ddf->mppe);
+ if (i == NO_SUCH_REFNUM)
+ continue;
+
this->array.working_disks++;
for (d = ddf->dlist; d ; d=d->next)
- if (d->disk.refnum == vc->conf.phys_refnum[i])
+ if (d->disk.refnum ==
+ ddf->phys->entries[pd].refnum)
break;
if (d == NULL)
/* Haven't found that one yet, maybe there are others */
--
1.7.3.4
* [PATCH 07/12] DDF: container_content_ddf: check for secondary RAID
From: mwilck @ 2013-03-01 22:28 UTC (permalink / raw)
To: neilb, linux-raid; +Cc: mwilck
Check for supportable secondary RAID configurations.
There is currently only one: RAID 10, if the stripe
sizes and Basic volume sizes are all equal.
With this patch, mdadm will not try to start unsupported
secondary RAID level configurations any more.
Signed-off-by: Martin Wilck <mwilck@arcor.de>
---
super-ddf.c | 73 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 73 insertions(+), 0 deletions(-)
diff --git a/super-ddf.c b/super-ddf.c
index a5080dd..4186038 100644
--- a/super-ddf.c
+++ b/super-ddf.c
@@ -3051,6 +3051,74 @@ static int load_container_ddf(struct supertype *st, int fd,
#endif /* MDASSEMBLE */
+static int check_secondary(const struct vcl *vc)
+{
+ const struct vd_config *conf = &vc->conf;
+ int i;
+
+ /* The only DDF secondary RAID level md can support is
+ * RAID 10, if the stripe sizes and Basic volume sizes
+ * are all equal.
+ * Other configurations could in theory be supported by exposing
+ * the BVDs to user space and using device mapper for the secondary
+ * mapping. So far we don't support that.
+ */
+
+ __u64 sec_elements[4] = {0, 0, 0, 0};
+#define __set_sec_seen(n) (sec_elements[(n)>>6] |= (1<<((n)&63)))
+#define __was_sec_seen(n) ((sec_elements[(n)>>6] & (1<<((n)&63))) != 0)
+
+ if (vc->other_bvds == NULL) {
+ pr_err("No BVDs for secondary RAID found\n");
+ return -1;
+ }
+ if (conf->prl != DDF_RAID1) {
+ pr_err("Secondary RAID level only supported for mirrored BVD\n");
+ return -1;
+ }
+ if (conf->srl != DDF_2STRIPED && conf->srl != DDF_2SPANNED) {
+ pr_err("Secondary RAID level %d is unsupported\n",
+ conf->srl);
+ return -1;
+ }
+ __set_sec_seen(conf->sec_elmnt_seq);
+ for (i = 0; i < conf->sec_elmnt_count-1; i++) {
+ const struct vd_config *bvd = vc->other_bvds[i];
+ if (bvd == NULL) {
+ pr_err("BVD %d is missing", i+1);
+ return -1;
+ }
+ if (bvd->srl != conf->srl) {
+ pr_err("Inconsistent secondary RAID level across BVDs\n");
+ return -1;
+ }
+ if (bvd->prl != conf->prl) {
+ pr_err("Different RAID levels for BVDs are unsupported\n");
+ return -1;
+ }
+ if (bvd->prim_elmnt_count != conf->prim_elmnt_count) {
+ pr_err("All BVDs must have the same number of primary elements\n");
+ return -1;
+ }
+ if (bvd->chunk_shift != conf->chunk_shift) {
+ pr_err("Different strip sizes for BVDs are unsupported\n");
+ return -1;
+ }
+ if (bvd->array_blocks != conf->array_blocks) {
+ pr_err("Different BVD sizes are unsupported\n");
+ return -1;
+ }
+ __set_sec_seen(bvd->sec_elmnt_seq);
+ }
+ for (i = 0; i < conf->sec_elmnt_count; i++) {
+ if (!__was_sec_seen(i)) {
+ pr_err("BVD %d is missing\n", i);
+ return -1;
+ }
+ }
+ return 0;
+}
+
#define NO_SUCH_REFNUM (0xFFFFFFFF)
static unsigned int get_pd_index_from_refnum(const struct vcl *vc,
__u32 refnum, unsigned int nmax)
@@ -3090,6 +3158,11 @@ static struct mdinfo *container_content_ddf(struct supertype *st, char *subarray
*ep != '\0'))
continue;
+ if (vc->conf.sec_elmnt_count > 1) {
+ if (check_secondary(vc) != 0)
+ continue;
+ }
+
this = xcalloc(1, sizeof(*this));
this->next = rest;
rest = this;
--
1.7.3.4
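The "seen" bookkeeping for BVD sequence numbers in the patch above can be sketched as a stand-alone bitmap. This is an illustration, not the patch's code: it assumes sequence numbers fit in 8 bits (as the four 64-bit words suggest) and uses a 64-bit constant for the shift, because a plain 1 has type int in C and shifting it by 31 or more bits is not well defined. All names here are hypothetical.

```c
#include <stdint.h>

/* 256 bits, one per possible 8-bit sequence number */
struct seq_bitmap {
	uint64_t words[4];
};

/* Record that sequence number n has been seen. */
static void seq_mark(struct seq_bitmap *b, uint8_t n)
{
	b->words[n >> 6] |= UINT64_C(1) << (n & 63);
}

/* Return 1 if sequence number n has been recorded, 0 otherwise. */
static int seq_seen(const struct seq_bitmap *b, uint8_t n)
{
	return (b->words[n >> 6] & (UINT64_C(1) << (n & 63))) != 0;
}

/* Return 1 if every sequence number in 0..count-1 has been recorded.
 * Only the first 256 numbers can be represented, so larger counts are
 * clamped to that range. */
static int seq_all_seen(const struct seq_bitmap *b, unsigned int count)
{
	unsigned int i;

	for (i = 0; i < count && i < 256; i++)
		if (!seq_seen(b, (uint8_t)i))
			return 0;
	return 1;
}
```

A caller would mark the sequence number of each BVD it finds and then call seq_all_seen() with the expected BVD count to detect a missing element.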
* [PATCH 08/12] DDF: container_content_ddf: handle RAID layout for RAID10
From: mwilck @ 2013-03-01 22:28 UTC (permalink / raw)
To: neilb, linux-raid; +Cc: mwilck
This patch adds basic handling for the special case of RAID10.
Signed-off-by: Martin Wilck <mwilck@arcor.de>
---
super-ddf.c | 79 +++++++++++++++++++++++++++++++++++++++++++++++++---------
1 files changed, 66 insertions(+), 13 deletions(-)
diff --git a/super-ddf.c b/super-ddf.c
index 4186038..01aa7d5 100644
--- a/super-ddf.c
+++ b/super-ddf.c
@@ -3121,12 +3121,45 @@ static int check_secondary(const struct vcl *vc)
#define NO_SUCH_REFNUM (0xFFFFFFFF)
static unsigned int get_pd_index_from_refnum(const struct vcl *vc,
- __u32 refnum, unsigned int nmax)
+ __u32 refnum, unsigned int nmax,
+ const struct vd_config **bvd,
+ unsigned int *idx)
{
- unsigned int i;
- for (i = 0 ; i < nmax ; i++)
- if (vc->conf.phys_refnum[i] == refnum)
- return i;
+ unsigned int i, j, n, sec, cnt;
+
+ cnt = __be16_to_cpu(vc->conf.prim_elmnt_count);
+ sec = (vc->conf.sec_elmnt_count == 1 ? 0 : vc->conf.sec_elmnt_seq);
+
+ for (i = 0, j = 0 ; i < nmax ; i++) {
+ /* j counts valid entries for this BVD */
+ if (vc->conf.phys_refnum[i] != 0xffffffff)
+ j++;
+ if (vc->conf.phys_refnum[i] == refnum) {
+ *bvd = &vc->conf;
+ *idx = i;
+ return sec * cnt + j - 1;
+ }
+ }
+ if (vc->other_bvds == NULL)
+ goto bad;
+
+ for (n = 1; n < vc->conf.sec_elmnt_count; n++) {
+ struct vd_config *vd = vc->other_bvds[n-1];
+ if (vd == NULL)
+ continue;
+ sec = vd->sec_elmnt_seq;
+ for (i = 0, j = 0 ; i < nmax ; i++) {
+ if (vd->phys_refnum[i] != 0xffffffff)
+ j++;
+ if (vd->phys_refnum[i] == refnum) {
+ *bvd = vd;
+ *idx = i;
+ return sec * cnt + j - 1;
+ }
+ }
+ }
+bad:
+ *bvd = NULL;
return NO_SUCH_REFNUM;
}
@@ -3167,11 +3200,26 @@ static struct mdinfo *container_content_ddf(struct supertype *st, char *subarray
this->next = rest;
rest = this;
- this->array.level = map_num1(ddf_level_num, vc->conf.prl);
- this->array.raid_disks =
- __be16_to_cpu(vc->conf.prim_elmnt_count);
- this->array.layout = rlq_to_layout(vc->conf.rlq, vc->conf.prl,
- this->array.raid_disks);
+ if (vc->conf.sec_elmnt_count == 1) {
+ this->array.level = map_num1(ddf_level_num,
+ vc->conf.prl);
+ this->array.raid_disks =
+ __be16_to_cpu(vc->conf.prim_elmnt_count);
+ this->array.layout =
+ rlq_to_layout(vc->conf.rlq, vc->conf.prl,
+ this->array.raid_disks);
+ } else {
+ /* The only supported layout is RAID 10.
+ * Compatibility has been checked in check_secondary()
+ * above.
+ */
+ this->array.level = 10;
+ this->array.raid_disks =
+ __be16_to_cpu(vc->conf.prim_elmnt_count)
+ * vc->conf.sec_elmnt_count;
+ this->array.layout = 0x100 |
+ __be16_to_cpu(vc->conf.prim_elmnt_count);
+ }
this->array.md_minor = -1;
this->array.major_version = -1;
this->array.minor_version = -2;
@@ -3213,6 +3261,9 @@ static struct mdinfo *container_content_ddf(struct supertype *st, char *subarray
for (pd = 0; pd < __be16_to_cpu(ddf->phys->used_pdes); pd++) {
struct mdinfo *dev;
struct dl *d;
+ const struct vd_config *bvd;
+ unsigned int iphys;
+ __u64 *lba_offset;
int stt;
if (ddf->phys->entries[pd].refnum == 0xFFFFFFFF)
@@ -3224,7 +3275,8 @@ static struct mdinfo *container_content_ddf(struct supertype *st, char *subarray
continue;
i = get_pd_index_from_refnum(
- vc, ddf->phys->entries[pd].refnum, ddf->mppe);
+ vc, ddf->phys->entries[pd].refnum,
+ ddf->mppe, &bvd, &iphys);
if (i == NO_SUCH_REFNUM)
continue;
@@ -3250,8 +3302,9 @@ static struct mdinfo *container_content_ddf(struct supertype *st, char *subarray
dev->recovery_start = MaxSector;
dev->events = __be32_to_cpu(ddf->primary.seq);
- dev->data_offset = __be64_to_cpu(vc->lba_offset[i]);
- dev->component_size = __be64_to_cpu(vc->conf.blocks);
+ lba_offset = (__u64 *)&bvd->phys_refnum[ddf->mppe];
+ dev->data_offset = __be64_to_cpu(lba_offset[iphys]);
+ dev->component_size = __be64_to_cpu(bvd->blocks);
if (d->devname)
strcpy(dev->name, d->devname);
}
--
1.7.3.4
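The slot computation in the patch above ("sec * cnt + j - 1") can be illustrated with a simplified sketch: a disk's md slot is its BVD's sequence number times the number of disks per BVD, plus its position among the valid entries of that BVD. The struct bvd_sketch type and the function md_slot_for_refnum are hypothetical simplifications, values are in host byte order, and unlike the patch the sketch skips unused entries entirely.

```c
#include <stdint.h>

#define NO_SLOT		0xFFFFFFFFu
#define REFNUM_UNUSED	0xFFFFFFFFu

/* Hypothetical, simplified view of one BVD config record */
struct bvd_sketch {
	unsigned int sec_seq;		/* position of this BVD in the SVD */
	const uint32_t *refnums;	/* physical disk reference numbers */
	unsigned int n_refnums;		/* number of entries in refnums */
};

/* Return the md slot for the disk with the given refnum, or NO_SLOT
 * if no BVD references it. */
static unsigned int md_slot_for_refnum(const struct bvd_sketch *bvds,
				       unsigned int n_bvds,
				       unsigned int disks_per_bvd,
				       uint32_t refnum)
{
	unsigned int b, i, valid;

	for (b = 0; b < n_bvds; b++) {
		valid = 0;	/* counts used entries seen so far */
		for (i = 0; i < bvds[b].n_refnums; i++) {
			if (bvds[b].refnums[i] == REFNUM_UNUSED)
				continue;
			if (bvds[b].refnums[i] == refnum)
				return bvds[b].sec_seq * disks_per_bvd + valid;
			valid++;
		}
	}
	return NO_SLOT;
}
```

For a 2+2 setup this places the two members of BVD 0 in slots 0 and 1 and the two members of BVD 1 in slots 2 and 3, which is the ordering a "near 2" md RAID10 expects for its mirror pairs.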
* [PATCH 09/12] DDF: __write_init_super_ddf: use correct VD conf
From: mwilck @ 2013-03-01 22:28 UTC (permalink / raw)
To: neilb, linux-raid; +Cc: mwilck
When writing back the DDF structure, make sure that on each disk
we write the configs that include this disk even if a secondary
RAID level is present. Otherwise the secondary RAID will not be
read correctly any more when we open the device next time.
Signed-off-by: Martin Wilck <mwilck@arcor.de>
---
super-ddf.c | 31 ++++++++++++++++++++++++-------
1 files changed, 24 insertions(+), 7 deletions(-)
diff --git a/super-ddf.c b/super-ddf.c
index 01aa7d5..4c3e6f4 100644
--- a/super-ddf.c
+++ b/super-ddf.c
@@ -2377,6 +2377,11 @@ static int remove_from_super_ddf(struct supertype *st, mdu_disk_info_t *dk)
*/
#define NULL_CONF_SZ 4096
+static unsigned int get_pd_index_from_refnum(const struct vcl *vc,
+ __u32 refnum, unsigned int nmax,
+ const struct vd_config **bvd,
+ unsigned int *idx);
+
static int __write_ddf_structure(struct dl *d, struct ddf_super *ddf, __u8 type,
char *null_aligned)
{
@@ -2422,14 +2427,26 @@ static int __write_ddf_structure(struct dl *d, struct ddf_super *ddf, __u8 type,
n_config = ddf->max_part;
conf_size = ddf->conf_rec_len * 512;
for (i = 0 ; i <= n_config ; i++) {
- struct vcl *c = d->vlist[i];
- if (i == n_config)
+ struct vcl *c;
+ struct vd_config *vdc = NULL;
+ if (i == n_config) {
c = (struct vcl *)d->spare;
-
+ if (c)
+ vdc = &c->conf;
+ } else {
+ unsigned int dummy;
+ c = d->vlist[i];
+ if (c)
+ get_pd_index_from_refnum(
+ c, d->disk.refnum,
+ ddf->mppe,
+ (const struct vd_config **)&vdc,
+ &dummy);
+ }
if (c) {
- c->conf.seqnum = ddf->primary.seq;
- c->conf.crc = calc_crc(&c->conf, conf_size);
- if (write(fd, &c->conf, conf_size) < 0)
+ vdc->seqnum = ddf->primary.seq;
+ vdc->crc = calc_crc(vdc, conf_size);
+ if (write(fd, vdc, conf_size) < 0)
break;
} else {
unsigned int togo = conf_size;
@@ -3085,7 +3102,7 @@ static int check_secondary(const struct vcl *vc)
for (i = 0; i < conf->sec_elmnt_count-1; i++) {
const struct vd_config *bvd = vc->other_bvds[i];
if (bvd == NULL) {
- pr_err("BVD %d is missing", i+1);
+ pr_err("BVD %d is missing\n", i+1);
return -1;
}
if (bvd->srl != conf->srl) {
--
1.7.3.4
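
The selection rule from the commit message above (on each disk, write the config record that actually lists this disk) can be illustrated with a small stand-alone sketch. The type `struct bvd_conf`, the constant `MAX_PD_PER_BVD` and the function `find_conf_for_disk` are hypothetical simplifications for illustration only; they are not the structures used in super-ddf.c, which keeps refnums big-endian and carries many more fields per record.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PD_PER_BVD 4	/* assumption: plays the role of ddf->mppe */

/* Hypothetical, simplified stand-in for one BVD configuration record. */
struct bvd_conf {
	uint32_t phys_refnum[MAX_PD_PER_BVD];
};

/*
 * Search n_bvds records for the one that lists refnum.
 * Returns that record and stores the slot index in *idx,
 * or returns NULL (leaving *idx untouched) if no record lists it.
 */
static const struct bvd_conf *find_conf_for_disk(const struct bvd_conf *bvds,
						 size_t n_bvds, uint32_t refnum,
						 size_t *idx)
{
	size_t b, i;

	for (b = 0; b < n_bvds; b++)
		for (i = 0; i < MAX_PD_PER_BVD; i++)
			if (bvds[b].phys_refnum[i] == refnum) {
				*idx = i;
				return &bvds[b];
			}
	return NULL;
}
```

In a 2+2 RAID10 setup this would be called with two records; a disk of the second mirror pair then gets the second record written back rather than the first.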
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 10/12] DDF: add sanity checks in compare_super_ddf
2013-03-01 22:28 DDF / RAID10 patch series for mdadm mwilck
` (8 preceding siblings ...)
2013-03-01 22:28 ` [PATCH 09/12] DDF: __write_init_super_ddf: use correct VD conf mwilck
@ 2013-03-01 22:28 ` mwilck
2013-03-01 22:28 ` [PATCH 11/12] DDF: compare_super_ddf: merge local info of other superblock mwilck
` (2 subsequent siblings)
12 siblings, 0 replies; 16+ messages in thread
From: mwilck @ 2013-03-01 22:28 UTC (permalink / raw)
To: neilb, linux-raid; +Cc: mwilck
Besides container GUID, also check seqnum, physical and virtual
disk numbers, and check match between local and global sections.
Signed-off-by: Martin Wilck <mwilck@arcor.de>
---
super-ddf.c | 42 ++++++++++++++++++++++++++++++++++++++++++
1 files changed, 42 insertions(+), 0 deletions(-)
diff --git a/super-ddf.c b/super-ddf.c
index 4c3e6f4..56c9721 100644
--- a/super-ddf.c
+++ b/super-ddf.c
@@ -3371,6 +3371,9 @@ static int compare_super_ddf(struct supertype *st, struct supertype *tst)
*/
struct ddf_super *first = st->sb;
struct ddf_super *second = tst->sb;
+ struct dl *dl2;
+ struct vcl *vl2;
+ unsigned int max_vds, max_pds, pd, vd;
if (!first) {
st->sb = tst->sb;
@@ -3381,7 +3384,46 @@ static int compare_super_ddf(struct supertype *st, struct supertype *tst)
if (memcmp(first->anchor.guid, second->anchor.guid, DDF_GUID_LEN) != 0)
return 2;
+ if (first->anchor.seq != second->anchor.seq) {
+ dprintf("%s: sequence number mismatch %u/%u\n", __func__,
+ __be32_to_cpu(first->anchor.seq),
+ __be32_to_cpu(second->anchor.seq));
+ return 3;
+ }
+ if (first->max_part != second->max_part ||
+ first->phys->used_pdes != second->phys->used_pdes ||
+ first->virt->populated_vdes != second->virt->populated_vdes) {
+ dprintf("%s: PD/VD number mismatch\n", __func__);
+ return 3;
+ }
+
+ max_pds = __be16_to_cpu(first->phys->used_pdes);
+ for (dl2 = second->dlist; dl2; dl2 = dl2->next) {
+ for (pd = 0; pd < max_pds; pd++)
+ if (first->phys->entries[pd].refnum == dl2->disk.refnum)
+ break;
+ if (pd == max_pds) {
+ dprintf("%s: no match for disk %08x\n", __func__,
+ __be32_to_cpu(dl2->disk.refnum));
+ return 3;
+ }
+ }
+
+ max_vds = __be16_to_cpu(first->active->max_vd_entries);
+ for (vl2 = second->conflist; vl2; vl2 = vl2->next) {
+ if (vl2->conf.magic != DDF_VD_CONF_MAGIC)
+ continue;
+ for (vd = 0; vd < max_vds; vd++)
+ if (!memcmp(first->virt->entries[vd].guid,
+ vl2->conf.guid, DDF_GUID_LEN))
+ break;
+ if (vd == max_vds) {
+ dprintf("%s: no match for VD config\n", __func__);
+ return 3;
+ }
+ }
/* FIXME should I look at anything else? */
+
return 0;
}
--
1.7.3.4
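
The local-versus-global check added by this patch follows a simple pattern: every disk known locally to the second superblock must appear in the global physical disk table of the first. A minimal sketch of that pattern, using plain arrays instead of the DDF structures, might look as follows. The function name `check_disks_known` is hypothetical, and the return value 3 is only modelled on what the patch returns for a mismatch.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return 0 if every refnum in local[] also occurs in known[],
 * otherwise 3 (the mismatch value used in the patch above).
 */
static int check_disks_known(const uint32_t *known, size_t n_known,
			     const uint32_t *local, size_t n_local)
{
	size_t i, pd;

	for (i = 0; i < n_local; i++) {
		for (pd = 0; pd < n_known; pd++)
			if (known[pd] == local[i])
				break;
		if (pd == n_known)
			return 3;	/* local disk missing from global table */
	}
	return 0;
}
```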
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 11/12] DDF: compare_super_ddf: merge local info of other superblock
2013-03-01 22:28 DDF / RAID10 patch series for mdadm mwilck
` (9 preceding siblings ...)
2013-03-01 22:28 ` [PATCH 10/12] DDF: add sanity checks in compare_super_ddf mwilck
@ 2013-03-01 22:28 ` mwilck
2013-03-01 22:28 ` [PATCH 12/12] Detail.c: call load_container for container subarrays mwilck
2013-03-04 5:22 ` DDF / RAID10 patch series for mdadm NeilBrown
12 siblings, 0 replies; 16+ messages in thread
From: mwilck @ 2013-03-01 22:28 UTC (permalink / raw)
To: neilb, linux-raid; +Cc: mwilck
If a match is found in compare_super_ddf, check the other SB
for local DDF information (VD config records, physical disk data)
which is not available in the current superblock, and add it
if needed.
This is important for mdmon - when disks are added to an
auto read-only array, they must be present in the DDF structure
in order to guarantee consistent writeback of metadata to all
disks.
Signed-off-by: Martin Wilck <mwilck@arcor.de>
---
super-ddf.c | 102 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 100 insertions(+), 2 deletions(-)
diff --git a/super-ddf.c b/super-ddf.c
index 56c9721..dbea77f 100644
--- a/super-ddf.c
+++ b/super-ddf.c
@@ -3371,8 +3371,8 @@ static int compare_super_ddf(struct supertype *st, struct supertype *tst)
*/
struct ddf_super *first = st->sb;
struct ddf_super *second = tst->sb;
- struct dl *dl2;
- struct vcl *vl2;
+ struct dl *dl1, *dl2;
+ struct vcl *vl1, *vl2;
unsigned int max_vds, max_pds, pd, vd;
if (!first) {
@@ -3424,6 +3424,104 @@ static int compare_super_ddf(struct supertype *st, struct supertype *tst)
}
/* FIXME should I look at anything else? */
+ /*
+ At this point we are fairly sure that the meta data matches.
+ But the new disk may contain additional local data.
+ Add it to the super block.
+ */
+ for (vl2 = second->conflist; vl2; vl2 = vl2->next) {
+ for (vl1 = first->conflist; vl1; vl1 = vl1->next)
+ if (!memcmp(vl1->conf.guid, vl2->conf.guid,
+ DDF_GUID_LEN))
+ break;
+ if (vl1) {
+ if (vl1->other_bvds != NULL &&
+ vl1->conf.sec_elmnt_seq !=
+ vl2->conf.sec_elmnt_seq) {
+ dprintf("%s: adding BVD %u\n", __func__,
+ vl2->conf.sec_elmnt_seq);
+ add_other_bvd(vl1, &vl2->conf,
+ first->conf_rec_len*512);
+ }
+ continue;
+ }
+
+ if (posix_memalign((void **)&vl1, 512,
+ (first->conf_rec_len*512 +
+ offsetof(struct vcl, conf))) != 0) {
+ pr_err("%s could not allocate vcl buf\n",
+ __func__);
+ return 3;
+ }
+
+ vl1->next = first->conflist;
+ vl1->block_sizes = NULL;
+ if (vl2->conf.sec_elmnt_count > 1) {
+ vl1->other_bvds = xcalloc(vl2->conf.sec_elmnt_count - 1,
+ sizeof(struct vd_config *));
+ } else
+ vl1->other_bvds = NULL;
+ memcpy(&vl1->conf, &vl2->conf, first->conf_rec_len*512);
+ vl1->lba_offset = (__u64 *)
+ &vl1->conf.phys_refnum[first->mppe];
+ for (vd = 0; vd < max_vds; vd++)
+ if (!memcmp(first->virt->entries[vd].guid,
+ vl1->conf.guid, DDF_GUID_LEN))
+ break;
+ vl1->vcnum = vd;
+ dprintf("%s: added config for VD %u\n", __func__, vl1->vcnum);
+ first->conflist = vl1;
+ }
+
+ for (dl2 = second->dlist; dl2; dl2 = dl2->next) {
+ for (dl1 = first->dlist; dl1; dl1 = dl1->next)
+ if (dl1->disk.refnum == dl2->disk.refnum)
+ break;
+ if (dl1)
+ continue;
+
+ if (posix_memalign((void **)&dl1, 512,
+ sizeof(*dl1) + (first->max_part) * sizeof(dl1->vlist[0]))
+ != 0) {
+ pr_err("%s could not allocate disk info buffer\n",
+ __func__);
+ return 3;
+ }
+ memcpy(dl1, dl2, sizeof(*dl1));
+ dl1->mdupdate = NULL;
+ dl1->next = first->dlist;
+ dl1->fd = -1;
+ for (pd = 0; pd < max_pds; pd++)
+ if (first->phys->entries[pd].refnum == dl1->disk.refnum)
+ break;
+ dl1->pdnum = pd;
+ if (dl2->spare) {
+ if (posix_memalign((void **)&dl1->spare, 512,
+ first->conf_rec_len*512) != 0) {
+ pr_err("%s could not allocate spare info buf\n",
+ __func__);
+ return 3;
+ }
+ memcpy(dl1->spare, dl2->spare, first->conf_rec_len*512);
+ }
+ for (vd = 0 ; vd < first->max_part ; vd++) {
+ if (!dl2->vlist[vd]) {
+ dl1->vlist[vd] = NULL;
+ continue;
+ }
+ for (vl1 = first->conflist; vl1; vl1 = vl1->next) {
+ if (!memcmp(vl1->conf.guid,
+ dl2->vlist[vd]->conf.guid,
+ DDF_GUID_LEN))
+ break;
+ }
+ dl1->vlist[vd] = vl1;
+ }
+ first->dlist = dl1;
+ dprintf("%s: added disk %d: %08x\n", __func__, dl1->pdnum,
+ dl1->disk.refnum);
+ }
+
return 0;
}
--
1.7.3.4
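
The merge step described in the commit message is essentially "find by key, prepend a copy if missing" on a singly linked list. The following is a minimal sketch of that idea under simplifying assumptions: `struct disk_node`, `find_disk` and `merge_disks` are hypothetical names, the node carries only a refnum, and plain malloc() is used where the patch uses posix_memalign().

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for struct dl. */
struct disk_node {
	struct disk_node *next;
	uint32_t refnum;
};

/* Return the node with the given refnum, or NULL if the list has none. */
static struct disk_node *find_disk(struct disk_node *head, uint32_t refnum)
{
	struct disk_node *d;

	for (d = head; d; d = d->next)
		if (d->refnum == refnum)
			break;
	return d;	/* NULL when the loop ran off the end of the list */
}

/*
 * Prepend copies of all nodes of 'second' that 'first' lacks.
 * Returns 0 on success, -1 if an allocation fails.
 */
static int merge_disks(struct disk_node **first, const struct disk_node *second)
{
	const struct disk_node *d2;

	for (d2 = second; d2; d2 = d2->next) {
		struct disk_node *d1;

		if (find_disk(*first, d2->refnum))
			continue;	/* already known, nothing to add */
		d1 = malloc(sizeof(*d1));
		if (!d1)
			return -1;
		d1->refnum = d2->refnum;
		d1->next = *first;
		*first = d1;
	}
	return 0;
}
```

Note that the result of the search is only used after the search loop has ended; using the loop variable inside the loop body, before the match test has succeeded, would pick up the wrong node.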
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 12/12] Detail.c: call load_container for container subarrays
2013-03-01 22:28 DDF / RAID10 patch series for mdadm mwilck
` (10 preceding siblings ...)
2013-03-01 22:28 ` [PATCH 11/12] DDF: compare_super_ddf: merge local info of other superblock mwilck
@ 2013-03-01 22:28 ` mwilck
2013-03-02 7:47 ` Paul Menzel
2013-03-04 5:22 ` DDF / RAID10 patch series for mdadm NeilBrown
12 siblings, 1 reply; 16+ messages in thread
From: mwilck @ 2013-03-01 22:28 UTC (permalink / raw)
To: neilb, linux-raid; +Cc: mwilck
Without calling load_container at this point, the
info structure may be missing some important information.
In particular, information about secondary DDF RAID levels
may be wrong if information is only read from a single disk.
If this fails, fall back to the previous code.
Signed-off-by: Martin Wilck <mwilck@arcor.de>
---
Detail.c | 10 +++++++++-
1 files changed, 9 insertions(+), 1 deletions(-)
diff --git a/Detail.c b/Detail.c
index ab49803..9e52de8 100644
--- a/Detail.c
+++ b/Detail.c
@@ -103,13 +103,21 @@ int Detail(char *dev, struct context *c)
* We want the name of the container, and the member
*/
int dn = st->container_dev;
+ int cfd, err;
member = subarray;
container = map_dev_preferred(dev2major(dn), dev2minor(dn), 1, c->prefer);
+ cfd = open_dev(st->container_dev);
+ if (cfd >= 0) {
+ err = st->ss->load_container(st, cfd, NULL);
+ close(cfd);
+ if (err == 0)
+ info = st->ss->container_content(st, subarray);
+ }
}
/* try to load a superblock */
- if (st) for (d = 0; d < max_disks; d++) {
+ if (st && !info) for (d = 0; d < max_disks; d++) {
mdu_disk_info_t disk;
char *dv;
int fd2;
--
1.7.3.4
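
The control flow of this change is "try the container-wide loader first, fall back to the per-disk path only if that produced nothing". A stand-alone sketch of the pattern is shown below. All names in it (`struct info`, `load_info`, and the three demo loaders) are hypothetical and do not correspond to mdadm functions. The sketch also makes explicit that the result pointer has to start out as NULL for the fallback test to be meaningful.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct mdinfo. */
struct info {
	int from_container;
};

typedef struct info *(*loader_fn)(void);

/*
 * Try the preferred loader first; use the fallback only when the
 * preferred loader is absent or yields nothing.
 */
static struct info *load_info(loader_fn preferred, loader_fn fallback)
{
	struct info *info = NULL;	/* must be NULL before the first attempt */

	if (preferred)
		info = preferred();
	if (!info && fallback)
		info = fallback();
	return info;
}

/* Demo loaders, purely for illustration. */
static struct info container_info = { 1 };
static struct info disk_info = { 0 };

static struct info *load_from_container(void)
{
	return &container_info;
}

static struct info *load_from_disks(void)
{
	return &disk_info;
}

static struct info *load_nothing(void)
{
	return NULL;
}
```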
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH 12/12] Detail.c: call load_container for container subarrays
2013-03-01 22:28 ` [PATCH 12/12] Detail.c: call load_container for container subarrays mwilck
@ 2013-03-02 7:47 ` Paul Menzel
0 siblings, 0 replies; 16+ messages in thread
From: Paul Menzel @ 2013-03-02 7:47 UTC (permalink / raw)
To: mwilck; +Cc: neilb, linux-raid
[-- Attachment #1: Type: text/plain, Size: 1517 bytes --]
Dear Martin,
thank you for your patches.
On Friday, 01.03.2013, at 23:28 +0100, mwilck@arcor.de wrote:
> Without calling load_container at this point, the
> info structure may be missing some important information.
> In particular, information about secondary DDF RAID levels
> may be wrong if information is only read from a single disk.
>
> If this fails, fall back to the previous code.
>
> Signed-off-by: Martin Wilck <mwilck@arcor.de>
> ---
> Detail.c | 10 +++++++++-
> 1 files changed, 9 insertions(+), 1 deletions(-)
>
> diff --git a/Detail.c b/Detail.c
> index ab49803..9e52de8 100644
> --- a/Detail.c
> +++ b/Detail.c
> @@ -103,13 +103,21 @@ int Detail(char *dev, struct context *c)
> * We want the name of the container, and the member
> */
> int dn = st->container_dev;
> + int cfd, err;
>
> member = subarray;
> container = map_dev_preferred(dev2major(dn), dev2minor(dn), 1, c->prefer);
> + cfd = open_dev(st->container_dev);
> + if (cfd >= 0) {
> + err = st->ss->load_container(st, cfd, NULL);
> + close(cfd);
> + if (err == 0)
> + info = st->ss->container_content(st, subarray);
> + }
> }
>
> /* try to load a superblock */
> - if (st) for (d = 0; d < max_disks; d++) {
> + if (st && !info) for (d = 0; d < max_disks; d++) {
Is it ensured that `info` was always NULL at this point before your patch?
Maybe add a comment, too.
> mdu_disk_info_t disk;
> char *dv;
> int fd2;
Thanks,
Paul
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: DDF / RAID10 patch series for mdadm
2013-03-01 22:28 DDF / RAID10 patch series for mdadm mwilck
` (11 preceding siblings ...)
2013-03-01 22:28 ` [PATCH 12/12] Detail.c: call load_container for container subarrays mwilck
@ 2013-03-04 5:22 ` NeilBrown
2013-03-06 18:26 ` Martin Wilck
12 siblings, 1 reply; 16+ messages in thread
From: NeilBrown @ 2013-03-04 5:22 UTC (permalink / raw)
To: mwilck; +Cc: linux-raid
[-- Attachment #1: Type: text/plain, Size: 4379 bytes --]
On Fri, 1 Mar 2013 23:28:21 +0100 mwilck@arcor.de wrote:
> Hello Neil, hello everybody,
>
> I am working on improved DDF support in mdadm. I would be grateful
> for a review on this patch series. The main new feature is support for
> RAID 10 in DDF, such that md is able to interoperate with an LSI
> Megaraid Software RAID driver on a RAID10 array.
>
> The DDF/RAID10 support is not feature complete, but usable. Both
> assembly and incremental assembly work as expected. I verified that
> the 2+2 layout maps cleanly to that used by the LSI SW RAID stack. Meta
> data are saved cleanly and interpreted correctly by the LSI stack.
> What is yet missing is code to handle disk failures and additions.
> I would appreciate some guidance in that area, in particular how to test it.
>
> In the DDF Spec, RAID10 is a special case of the broad "secondary RAID
> level" concept. The intersection between DDF RAID 10 and md RAID 10 is
> tiny - just the "near" layout with an even number of disks. Fortunately this
> is also the only type of "secondary RAID" supported by my LSI stack,
> and generally the most common multilevel RAID, AFAICT.
>
> The DDF 2nd level RAID concept would be matched more closely by a
> different approach where md only creates the BVDs and the secondary
> RAID is implemented with the device mapper on top of it. But I opted
> for the direct mapping of DDF RAID10 to md RAID10 for several reasons:
> 1) dm wouldn't handle the generic "striped" and "spanned" secondary
> levels out-of-the box, either. 2) I wanted to take full benefit of the
> advanced features and performance of md RAID10 (it actually performed
> better than the LSI stack in a few simple benchmarks). 3) It is
> simpler to handle this way. 4) This is non-exclusive, code could still
> be added later to use dm for more "exotic" secondary DDF RAID levels.
>
> The series starts out with 3 patches I already submitted a while
> ago. They fix interoperability related to meta data handling, in particular
> DDF header locations and sequence numbers. They are unrelated to RAID10.
>
> Patch 4..9 introduce support for DDF RAID 10. The current mdadm
> implementation of DDF ignores secondary level completely and stores
> only the BVD information in the super block, one BVD for every DDF
> "Virtual disk configuration record". I had to add an additional data
> structure to hold information about the "other BVDs" belonging to a
> second level RAID set. The most interesting stuff happens in
> container_content_ddf, which translates the DDF RAID 10 setup into md
> structures.
>
> Patch 10 adds some missing sanity checks in compare_super_ddf. Patch 11
> extends compare_super_ddf() such that information from the new disk that
> is missing in the current superblock is added. This fixes a problem where
> mdmon would save updated meta data only on those disks that were present
> when mdadm was started, not on disks added later.
>
> Finally, patch 12 is a small improvement to Detail() that results in
> better output of mdadm -D for complex DDF setups.
>
> Regards
> Martin
>
> Martin Wilck (12):
> DDF: cleanly save the secondary DDF structure
> DDF: use existing locations for primary and secondary DDF structure
> DDF: increase seq number when writing meta data
> DDF: added other_bvd to struct vcl
> DDF: load_ddf_local: store VD conf for other BVDs
> DDF: container_content_ddf: change array disk search loop
> DDF: container_content_ddf: check for secondary RAID
> DDF: container_content_ddf: handle RAID layout for RAID10
> DDF: __write_init_super_ddf: use correct VD conf
> DDF: add sanity checks in compare_super_ddf
> DDF: compare_super_ddf: merge local info of other superblock
> Detail.c: call load_container for container subarrays
>
> Detail.c | 10 +-
> super-ddf.c | 556 ++++++++++++++++++++++++++++++++++++++++++++++++++---------
> 2 files changed, 484 insertions(+), 82 deletions(-)
>
Wow! Thanks for these! Happy to see DDF getting some attention.
I've applied all the patches (with a few minor cosmetic changes). I haven't
studied them all very closely, but what I did examine looked good.
If you could add some tests to the 'tests' directory to make sure that
ddf/raid10 keeps working that would be great.
Thanks,
NeilBrown
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: DDF / RAID10 patch series for mdadm
2013-03-04 5:22 ` DDF / RAID10 patch series for mdadm NeilBrown
@ 2013-03-06 18:26 ` Martin Wilck
0 siblings, 0 replies; 16+ messages in thread
From: Martin Wilck @ 2013-03-06 18:26 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-raid
Hi Neil,
> Wow! Thanks for these! Happy to see DDF getting some attention.
>
> I've applied all the patches (with a few minor cosmetic changes). I haven't
> studied them all very closely, but what I did examine looked good.
>
> If you could add some tests to the 'tests' directory to make sure that
> ddf/raid10 keeps working that would be great.
Thanks a lot for applying. I'll have to study the test code to come up
with test cases. There'll be some work to be done for Create first (not
everyone has a BIOS RAID available for that).
I am currently working on a second patch series that deals with failures
and metadata updates.
Martin
^ permalink raw reply [flat|nested] 16+ messages in thread