* [PATCH v2 0/6] mdadm support for journal device of RAID-4/5/6
@ 2015-10-09 5:51 Song Liu
2015-10-09 5:51 ` [PATCH v2 1/6] add macros for MD_DISK_ROLE_(SPARE/FAULTY) Song Liu
` (6 more replies)
0 siblings, 7 replies; 8+ messages in thread
From: Song Liu @ 2015-10-09 5:51 UTC (permalink / raw)
To: linux-raid; +Cc: neilb, shli, hch, dan.j.williams, kernel-team, Song Liu
Hi,
This is v2 of the mdadm patches that add support for a journal device in RAID-4/5/6.
Shaohua sent the latest kernel patches earlier today:
http://marc.info/?l=linux-raid&m=144436646730381
Most of these patches are very close to v1, except for --assemble, where
we improved the checks for journal devices.
The following is copied from the v1 patches.
These patches add write journal support for the following commands:
mdadm --detail
mdadm --examine
mdadm --create --write-journal DEVICE
mdadm --assemble
mdadm --incremental
The journal device is assigned dev_role 0xFFFD (0xFFFF is used for
spares and 0xFFFE for failed devices). Note that there is a compatibility
issue: an older mdadm will show the journal device as a spare in --detail:
Number Major Minor RaidDevice State
0 8 32 0 active sync /dev/sdc
1 8 48 1 active sync /dev/sdd
2 8 64 2 active sync /dev/sde
3 8 80 3 active sync /dev/sdf
4 8 17 - spare /dev/sdb1
Also, an older mdadm will show the journal device as "Active device 65533"
in --examine:
Device Role : Active device 65533
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
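To make the compatibility issue concrete, here is a minimal sketch of how a dev_role could be turned into the "Device Role" text, with and without journal awareness. The helper describe_role() and its journal_aware flag are hypothetical and exist only for illustration; the role values are the ones quoted above, and the spare/faulty handling follows the examine_super1() code touched in patch 1.

```c
#include <stdint.h>
#include <stdio.h>

/* Special dev_role values quoted in the cover letter. */
#define MD_DISK_ROLE_SPARE   0xffff
#define MD_DISK_ROLE_FAULTY  0xfffe
#define MD_DISK_ROLE_JOURNAL 0xfffd

/*
 * Hypothetical helper (not part of mdadm): format a dev_role the way a
 * journal-aware --examine would, or, when journal_aware is 0, the way an
 * older mdadm that does not know about 0xfffd would.
 */
static const char *describe_role(uint16_t role, int journal_aware,
				 char *buf, size_t len)
{
	if (role == MD_DISK_ROLE_SPARE || role == MD_DISK_ROLE_FAULTY)
		snprintf(buf, len, "spare");
	else if (journal_aware && role == MD_DISK_ROLE_JOURNAL)
		snprintf(buf, len, "Journal");
	else
		snprintf(buf, len, "Active device %u", (unsigned)role);
	return buf;
}
```

With journal_aware set to 0, role 0xfffd falls through to the generic branch and is printed as a decimal slot number, which is where "Active device 65533" comes from.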
Song Liu (6):
add macros for MD_DISK_ROLE_(SPARE/FAULTY)
Show device as journal in --detail --examine
Enable create array with write journal (--write-journal DEVICE).
Assemble array with write journal
Check write journal in incremental
Add help message and man entry for --write-journal
Assemble.c | 57 ++++++++++++++++++-----
Create.c | 20 +++++---
Detail.c | 3 +-
Incremental.c | 31 +++++++++++--
ReadMe.c | 2 +
md_p.h | 64 ++++++++++++++++++++++++++
mdadm.8.in | 6 +++
mdadm.c | 23 ++++++++++
mdadm.h | 5 ++
super1.c | 144 ++++++++++++++++++++++++++++++++++++++++++++++++++--------
10 files changed, 313 insertions(+), 42 deletions(-)
--
2.4.6
* [PATCH v2 1/6] add macros for MD_DISK_ROLE_(SPARE/FAULTY)
From: Song Liu @ 2015-10-09 5:51 UTC (permalink / raw)
To: linux-raid; +Cc: neilb, shli, hch, dan.j.williams, kernel-team, Song Liu
Replace the special disk roles (0xffff, 0xfffe) with macros:
#define MD_DISK_ROLE_SPARE 0xffff
#define MD_DISK_ROLE_FAULTY 0xfffe
The macro for the journal device will be added in the next patch:
#define MD_DISK_ROLE_JOURNAL 0xfffd
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
---
md_p.h | 4 ++++
super1.c | 30 +++++++++++++++---------------
2 files changed, 19 insertions(+), 15 deletions(-)
diff --git a/md_p.h b/md_p.h
index 9b6b5f8..3a3b8af 100644
--- a/md_p.h
+++ b/md_p.h
@@ -92,6 +92,10 @@
#define MD_DISK_REPLACEMENT 17
+#define MD_DISK_ROLE_SPARE 0xffff
+#define MD_DISK_ROLE_FAULTY 0xfffe
+#define MD_DISK_ROLE_MAX 0xff00 /* max value of regular disk role */
+
typedef struct mdp_device_descriptor_s {
__u32 number; /* 0 Device number in the entire set */
__u32 major; /* 1 Device major number */
diff --git a/super1.c b/super1.c
index 6f42291..b881eb5 100644
--- a/super1.c
+++ b/super1.c
@@ -465,13 +465,13 @@ static void examine_super1(struct supertype *st, char *homehost)
/* This turns out to just be confusing */
printf(" Array Slot : %d (", __le32_to_cpu(sb->dev_number));
for (i= __le32_to_cpu(sb->max_dev); i> 0 ; i--)
- if (__le16_to_cpu(sb->dev_roles[i-1]) != 0xffff)
+ if (__le16_to_cpu(sb->dev_roles[i-1]) != MD_DISK_ROLE_SPARE)
break;
for (d=0; d < i; d++) {
int role = __le16_to_cpu(sb->dev_roles[d]);
if (d) printf(", ");
- if (role == 0xffff) printf("empty");
- else if(role == 0xfffe) printf("failed");
+ if (role == MD_DISK_ROLE_SPARE) printf("empty");
+ else if(role == MD_DISK_ROLE_FAULTY) printf("failed");
else printf("%d", role);
}
printf(")\n");
@@ -481,8 +481,8 @@ static void examine_super1(struct supertype *st, char *homehost)
if (d < __le32_to_cpu(sb->max_dev))
role = __le16_to_cpu(sb->dev_roles[d]);
else
- role = 0xFFFF;
- if (role >= 0xFFFE)
+ role = MD_DISK_ROLE_SPARE;
+ if (role >= MD_DISK_ROLE_FAULTY)
printf("spare\n");
else if (sb->feature_map & __cpu_to_le32(MD_FEATURE_REPLACEMENT))
printf("Replacement device %d\n", role);
@@ -512,7 +512,7 @@ static void examine_super1(struct supertype *st, char *homehost)
faulty = 0;
for (i=0; i< __le32_to_cpu(sb->max_dev); i++) {
int role = __le16_to_cpu(sb->dev_roles[i]);
- if (role == 0xFFFE)
+ if (role == MD_DISK_ROLE_FAULTY)
faulty++;
}
if (faulty) printf(" %d failed", faulty);
@@ -922,7 +922,7 @@ static void getinfo_super1(struct supertype *st, struct mdinfo *info, char *map)
info->disk.number = __le32_to_cpu(sb->dev_number);
if (__le32_to_cpu(sb->dev_number) >= __le32_to_cpu(sb->max_dev) ||
__le32_to_cpu(sb->dev_number) >= MAX_DEVS)
- role = 0xfffe;
+ role = MD_DISK_ROLE_FAULTY;
else
role = __le16_to_cpu(sb->dev_roles[__le32_to_cpu(sb->dev_number)]);
@@ -989,10 +989,10 @@ static void getinfo_super1(struct supertype *st, struct mdinfo *info, char *map)
info->disk.raid_disk = -1;
switch(role) {
- case 0xFFFF:
+ case MD_DISK_ROLE_SPARE:
info->disk.state = 0; /* spare: not active, not sync, not faulty */
break;
- case 0xFFFE:
+ case MD_DISK_ROLE_FAULTY:
info->disk.state = 1; /* faulty */
break;
default:
@@ -1042,7 +1042,7 @@ static void getinfo_super1(struct supertype *st, struct mdinfo *info, char *map)
map[i] = 0;
for (i = 0; i < __le32_to_cpu(sb->max_dev); i++) {
role = __le16_to_cpu(sb->dev_roles[i]);
- if (/*role == 0xFFFF || */role < (unsigned) info->array.raid_disks) {
+ if (/*role == MD_DISK_ROLE_SPARE || */role < (unsigned) info->array.raid_disks) {
working++;
if (map && role < map_disks)
map[role] = 1;
@@ -1115,7 +1115,7 @@ static int update_super1(struct supertype *st, struct mdinfo *info,
if (info->disk.state & (1<<MD_DISK_ACTIVE))
want = info->disk.raid_disk;
else
- want = 0xFFFF;
+ want = MD_DISK_ROLE_SPARE;
if (sb->dev_roles[d] != __cpu_to_le16(want)) {
sb->dev_roles[d] = __cpu_to_le16(want);
rv = 1;
@@ -1140,7 +1140,7 @@ static int update_super1(struct supertype *st, struct mdinfo *info,
unsigned int max = __le32_to_cpu(sb->max_dev);
for (i=0 ; i < max ; i++)
- if (__le16_to_cpu(sb->dev_roles[i]) >= 0xfffe)
+ if (__le16_to_cpu(sb->dev_roles[i]) >= MD_DISK_ROLE_FAULTY)
break;
sb->dev_number = __cpu_to_le32(i);
info->disk.number = i;
@@ -1439,9 +1439,9 @@ static int add_to_super1(struct supertype *st, mdu_disk_info_t *dk,
if ((dk->state & 6) == 6) /* active, sync */
*rp = __cpu_to_le16(dk->raid_disk);
else if ((dk->state & ~2) == 0) /* active or idle -> spare */
- *rp = 0xffff;
+ *rp = MD_DISK_ROLE_SPARE;
else
- *rp = 0xfffe;
+ *rp = MD_DISK_ROLE_FAULTY;
if (dk->number >= (int)__le32_to_cpu(sb->max_dev) &&
__le32_to_cpu(sb->max_dev) < MAX_DEVS)
@@ -2445,7 +2445,7 @@ void *super1_make_v0(struct supertype *st, struct mdinfo *info, mdp_super_t *sb0
for (i = 0; i < MD_SB_DISKS; i++) {
int state = sb0->disks[i].state;
- sb->dev_roles[i] = 0xFFFF;
+ sb->dev_roles[i] = MD_DISK_ROLE_SPARE;
if ((state & (1<<MD_DISK_SYNC)) &&
!(state & (1<<MD_DISK_FAULTY)))
sb->dev_roles[i] = __cpu_to_le16(sb0->disks[i].raid_disk);
--
2.4.6
* [PATCH v2 2/6] Show device as journal in --detail --examine
From: Song Liu @ 2015-10-09 5:51 UTC (permalink / raw)
To: linux-raid; +Cc: neilb, shli, hch, dan.j.williams, kernel-team, Song Liu
Example output:
./mdadm --detail /dev/md127
/dev/md127:
Version : 1.2
Creation Time : Wed May 13 17:01:12 2015
Raid Level : raid5
Array Size : 11720662464 (11177.69 GiB 12001.96 GB)
Used Dev Size : 3906887488 (3725.90 GiB 4000.65 GB)
Raid Devices : 4
Total Devices : 5
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Wed May 13 17:01:12 2015
State : clean
Active Devices : 4
Working Devices : 5
Failed Devices : 0
Spare Devices : 1
Layout : left-symmetric
Chunk Size : 32K
Name : 0
UUID : 8fb9ee05:3831d52f:e5c23825:28cd6881
Events : 0
Number Major Minor RaidDevice State
0 8 32 0 active sync /dev/sdc
1 8 48 1 active sync /dev/sdd
2 8 64 2 active sync /dev/sde
3 8 80 3 active sync /dev/sdf
4 8 17 - journal /dev/sdb1
./mdadm -E /dev/sdb2
/dev/sdb2:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x201
Array UUID : 562b2334:35b9bcc1:add50892:1f30c4bd
Name : 0
Creation Time : Thu Aug 27 12:55:26 2015
Raid Level : raid5
Raid Devices : 15
Avail Dev Size : 249796608 (119.11 GiB 127.90 GB)
Array Size : 54696423936 (52162.57 GiB 56009.14 GB)
Used Dev Size : 7813774848 (3725.90 GiB 4000.65 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Unused Space : before=262056 sectors, after=0 sectors
State : active
Device UUID : 5015e522:d39ba566:5909cf3c:9c51f2ff
Internal Bitmap : 8 sectors from superblock
Update Time : Thu Aug 27 13:16:55 2015
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : 4e6fd76d - correct
Events : 262
Layout : left-symmetric
Chunk Size : 256K
Device Role : Journal
Array State : AAAAAAAAAAAAAAA ('A' == active, '.' == missing, 'R' == replacing)
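The "Feature Map : 0x201" line in the --examine output above can be read as a bit mask. The following is a small sketch, not mdadm code: the helper names are hypothetical, MD_FEATURE_JOURNAL is the value this patch adds to super1.c, and MD_FEATURE_BITMAP_OFFSET is assumed to be 1 (its value is not shown in this thread).

```c
#include <stdint.h>

#define MD_FEATURE_BITMAP_OFFSET 1   /* assumed value, not shown in this thread */
#define MD_FEATURE_JOURNAL       512 /* added to super1.c by this patch */

/* Hypothetical helpers: test single bits of a CPU-endian feature map. */
static int feature_has_journal(uint32_t feature_map)
{
	return (feature_map & MD_FEATURE_JOURNAL) != 0;
}

static int feature_has_bitmap(uint32_t feature_map)
{
	return (feature_map & MD_FEATURE_BITMAP_OFFSET) != 0;
}
```

Under these assumptions 0x201 is 0x200 | 0x001, that is, the journal feature plus an internal bitmap, which matches the "Internal Bitmap" line in the same output.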
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
---
Detail.c | 3 ++-
md_p.h | 2 ++
super1.c | 9 +++++++++
3 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/Detail.c b/Detail.c
index dd72ede..200f65f 100644
--- a/Detail.c
+++ b/Detail.c
@@ -650,9 +650,10 @@ This is pretty boring
}
if (disk.state & (1<<MD_DISK_REMOVED)) printf(" removed");
if (disk.state & (1<<MD_DISK_WRITEMOSTLY)) printf(" writemostly");
+ if (disk.state & (1<<MD_DISK_JOURNAL)) printf(" journal");
if ((disk.state &
((1<<MD_DISK_ACTIVE)|(1<<MD_DISK_SYNC)
- |(1<<MD_DISK_REMOVED)|(1<<MD_DISK_FAULTY)))
+ |(1<<MD_DISK_REMOVED)|(1<<MD_DISK_FAULTY)|(1<<MD_DISK_JOURNAL)))
== 0) {
printf(" spare");
if (is_26) {
diff --git a/md_p.h b/md_p.h
index 3a3b8af..fae73ba 100644
--- a/md_p.h
+++ b/md_p.h
@@ -91,9 +91,11 @@
*/
#define MD_DISK_REPLACEMENT 17
+#define MD_DISK_JOURNAL 18 /* disk is used as the write journal in RAID-5/6 */
#define MD_DISK_ROLE_SPARE 0xffff
#define MD_DISK_ROLE_FAULTY 0xfffe
+#define MD_DISK_ROLE_JOURNAL 0xfffd
#define MD_DISK_ROLE_MAX 0xff00 /* max value of regular disk role */
typedef struct mdp_device_descriptor_s {
diff --git a/super1.c b/super1.c
index b881eb5..6905b6d 100644
--- a/super1.c
+++ b/super1.c
@@ -126,6 +126,7 @@ struct misc_dev_info {
*/
#define MD_FEATURE_NEW_OFFSET 64 /* new_offset must be honoured */
#define MD_FEATURE_BITMAP_VERSIONED 256 /* bitmap version number checked properly */
+#define MD_FEATURE_JOURNAL 512 /* support write journal */
#define MD_FEATURE_ALL (MD_FEATURE_BITMAP_OFFSET \
|MD_FEATURE_RECOVERY_OFFSET \
|MD_FEATURE_RESHAPE_ACTIVE \
@@ -134,6 +135,7 @@ struct misc_dev_info {
|MD_FEATURE_RESHAPE_BACKWARDS \
|MD_FEATURE_NEW_OFFSET \
|MD_FEATURE_BITMAP_VERSIONED \
+ |MD_FEATURE_JOURNAL \
)
/* return how many bytes are needed for bitmap, for cluster-md each node
@@ -484,6 +486,8 @@ static void examine_super1(struct supertype *st, char *homehost)
role = MD_DISK_ROLE_SPARE;
if (role >= MD_DISK_ROLE_FAULTY)
printf("spare\n");
+ else if (role == MD_DISK_ROLE_JOURNAL)
+ printf("Journal\n");
else if (sb->feature_map & __cpu_to_le32(MD_FEATURE_REPLACEMENT))
printf("Replacement device %d\n", role);
else
@@ -995,6 +999,11 @@ static void getinfo_super1(struct supertype *st, struct mdinfo *info, char *map)
case MD_DISK_ROLE_FAULTY:
info->disk.state = 1; /* faulty */
break;
+ case MD_DISK_ROLE_JOURNAL:
+ info->disk.state = (1 << MD_DISK_JOURNAL);
+ info->disk.raid_disk = role;
+ info->space_after = (misc->device_size - info->data_offset) % 8; /* journal uses all 4kB blocks*/
+ break;
default:
info->disk.state = 6; /* active and in sync */
info->disk.raid_disk = role;
--
2.4.6
* [PATCH v2 3/6] Enable create array with write journal (--write-journal DEVICE).
From: Song Liu @ 2015-10-09 5:51 UTC (permalink / raw)
To: linux-raid; +Cc: neilb, shli, hch, dan.j.williams, kernel-team, Song Liu
Specify the write journal device with --write-journal DEVICE:
./mdadm --create -f /dev/md0 --assume-clean -c 32 --raid-devices=4 --level=5 /dev/sd[c-f] --write-journal /dev/sdb1
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
Only one journal device is allowed. If --write-journal is given more
than once, mdadm uses the first device and ignores the others:
./mdadm --create -f /dev/md0 --assume-clean -c 32 --raid-devices=4 --level=5 /dev/sd[c-f] --write-journal /dev/sdb1 --write-journal /dev/sdx
mdadm: Please specify only one journal device for the array.
mdadm: Ignoring --write-journal /dev/sdx...
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
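The "first --write-journal wins" rule and the RAID level restriction can be sketched as follows. This is a simplified illustration, not the mdadm.c code: struct journal_choice and both helper names are hypothetical, and the real option handler also appends the device to the device list with disposition 'j'.

```c
#include <stddef.h>

/* Hypothetical bookkeeping for --write-journal arguments. */
struct journal_choice {
	const char *device; /* first --write-journal argument, or NULL */
	int ignored;        /* later --write-journal arguments that were dropped */
};

/* Record one --write-journal argument; only the first one is kept. */
static void note_write_journal(struct journal_choice *jc, const char *arg)
{
	if (jc->device) {
		jc->ignored++; /* mdadm prints a warning and continues */
		return;
	}
	jc->device = arg;
}

/* --write-journal is only accepted for RAID levels 4, 5 and 6. */
static int level_allows_journal(int level)
{
	return level >= 4 && level <= 6;
}
```

Called with /dev/sdb1 and then /dev/sdx, as in the second example above, the sketch keeps /dev/sdb1 and counts one ignored argument.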
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
---
Create.c | 20 +++++++++++++------
ReadMe.c | 1 +
md_p.h | 58 +++++++++++++++++++++++++++++++++++++++++++++++++++++
mdadm.c | 23 +++++++++++++++++++++
mdadm.h | 2 ++
super1.c | 70 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
6 files changed, 167 insertions(+), 7 deletions(-)
diff --git a/Create.c b/Create.c
index b200d97..21d1374 100644
--- a/Create.c
+++ b/Create.c
@@ -87,7 +87,7 @@ int Create(struct supertype *st, char *mddev,
unsigned long long minsize=0, maxsize=0;
char *mindisc = NULL;
char *maxdisc = NULL;
- int dnum;
+ int dnum, raid_disk_num;
struct mddev_dev *dv;
int fail=0, warn=0;
struct stat stb;
@@ -182,11 +182,11 @@ int Create(struct supertype *st, char *mddev,
pr_err("This metadata type does not support spare disks at create time\n");
return 1;
}
- if (subdevs > s->raiddisks+s->sparedisks) {
+ if (subdevs > s->raiddisks+s->sparedisks+s->journaldisks) {
pr_err("You have listed more devices (%d) than are in the array(%d)!\n", subdevs, s->raiddisks+s->sparedisks);
return 1;
}
- if (!have_container && subdevs < s->raiddisks+s->sparedisks) {
+ if (!have_container && subdevs < s->raiddisks+s->sparedisks+s->journaldisks) {
pr_err("You haven't given enough devices (real or missing) to create this array\n");
return 1;
}
@@ -399,6 +399,9 @@ int Create(struct supertype *st, char *mddev,
}
}
+ if (dv->disposition == 'j')
+ continue; /* skip write journal for size check */
+
freesize /= 2; /* convert to K */
if (s->chunk && s->chunk != UnSet) {
/* round to chunk size */
@@ -839,7 +842,7 @@ int Create(struct supertype *st, char *mddev,
for (pass=1; pass <=2 ; pass++) {
struct mddev_dev *moved_disk = NULL; /* the disk that was moved out of the insert point */
- for (dnum=0, dv = devlist ; dv ;
+ for (dnum=0, raid_disk_num=0, dv = devlist ; dv ;
dv=(dv->next)?(dv->next):moved_disk, dnum++) {
int fd;
struct stat stb;
@@ -864,8 +867,13 @@ int Create(struct supertype *st, char *mddev,
*inf = info;
inf->disk.number = dnum;
- inf->disk.raid_disk = dnum;
- if (inf->disk.raid_disk < s->raiddisks)
+ inf->disk.raid_disk = raid_disk_num++;
+
+ if (dv->disposition == 'j') {
+ inf->disk.raid_disk = MD_DISK_ROLE_JOURNAL;
+ inf->disk.state = (1<<MD_DISK_JOURNAL);
+ raid_disk_num--;
+ } else if (inf->disk.raid_disk < s->raiddisks)
inf->disk.state = (1<<MD_DISK_ACTIVE) |
(1<<MD_DISK_SYNC);
else
diff --git a/ReadMe.c b/ReadMe.c
index c242319..10921e3 100644
--- a/ReadMe.c
+++ b/ReadMe.c
@@ -142,6 +142,7 @@ struct option long_options[] = {
{"data-offset",1, 0, DataOffset},
{"nodes",1, 0, Nodes}, /* also for --assemble */
{"home-cluster",1, 0, ClusterName},
+ {"write-journal",1, 0, WriteJournal},
/* For assemble */
{"uuid", 1, 0, 'u'},
diff --git a/md_p.h b/md_p.h
index fae73ba..0d691fb 100644
--- a/md_p.h
+++ b/md_p.h
@@ -208,4 +208,62 @@ static inline __u64 md_event(mdp_super_t *sb) {
return (ev<<32)| sb->events_lo;
}
+struct r5l_payload_header {
+ __u16 type;
+ __u16 flags;
+} __attribute__ ((__packed__));
+
+enum r5l_payload_type {
+ R5LOG_PAYLOAD_DATA = 0,
+ R5LOG_PAYLOAD_PARITY = 1,
+ R5LOG_PAYLOAD_FLUSH = 2,
+};
+
+struct r5l_payload_data_parity {
+ struct r5l_payload_header header;
+ __u32 size; /* sector. data/parity size. each 4k has a checksum */
+ __u64 location; /* sector. For data, it's raid sector. For
+ parity, it's stripe sector */
+ __u32 checksum[];
+} __attribute__ ((__packed__));
+
+enum r5l_payload_data_parity_flag {
+ R5LOG_PAYLOAD_FLAG_DISCARD = 1, /* payload is discard */
+ /*
+ * RESHAPED/RESHAPING is only set when there is reshape activity. Note,
+ * both data/parity of a stripe should have the same flag set
+ *
+ * RESHAPED: reshape is running, and this stripe finished reshape
+ * RESHAPING: reshape is running, and this stripe isn't reshaped
+ * */
+ R5LOG_PAYLOAD_FLAG_RESHAPED = 2,
+ R5LOG_PAYLOAD_FLAG_RESHAPING = 3,
+};
+
+struct r5l_payload_flush {
+ struct r5l_payload_header header;
+ __u32 size; /* flush_stripes size, bytes */
+ __u64 flush_stripes[];
+} __attribute__ ((__packed__));
+
+enum r5l_payload_flush_flag {
+ R5LOG_PAYLOAD_FLAG_FLUSH_STRIPE = 1, /* data represents whole stripe */
+};
+
+struct r5l_meta_block {
+ __u32 magic;
+ __u32 checksum;
+ __u8 version;
+ __u8 __zero_pading_1;
+ __u16 __zero_pading_2;
+ __u32 meta_size; /* whole size of the block */
+
+ __u64 seq;
+ __u64 position; /* sector, start from rdev->data_offset, current position */
+ struct r5l_payload_header payloads[];
+} __attribute__ ((__packed__));
+
+#define R5LOG_VERSION 0x1
+#define R5LOG_MAGIC 0x6433c509
+
#endif
diff --git a/mdadm.c b/mdadm.c
index 183f6c8..f32a3d4 100644
--- a/mdadm.c
+++ b/mdadm.c
@@ -74,6 +74,7 @@ int main(int argc, char *argv[])
.require_homehost = 1,
};
struct shape s = {
+ .journaldisks = 0,
.level = UnSet,
.layout = UnSet,
.bitmap_chunk = UnSet,
@@ -1170,6 +1171,23 @@ int main(int argc, char *argv[])
case O(INCREMENTAL, IncrementalPath):
remove_path = optarg;
continue;
+ case O(CREATE, WriteJournal):
+ if (s.journaldisks) {
+ pr_err("Please specify only one journal device for the array.\n");
+ pr_err("Ignoring --write-journal %s...\n", optarg);
+ continue;
+ }
+ dv = xmalloc(sizeof(*dv));
+ dv->devname = optarg;
+ dv->disposition = 'j'; /* WriteJournal */
+ dv->used = 0;
+ dv->next = NULL;
+ *devlistend = dv;
+ devlistend = &dv->next;
+ devs_found++;
+
+ s.journaldisks = 1;
+ continue;
}
/* We have now processed all the valid options. Anything else is
* an error
@@ -1197,6 +1215,11 @@ int main(int argc, char *argv[])
exit(0);
}
+ if (s.journaldisks && (s.level < 4 || s.level > 6)) {
+ pr_err("--write-journal is only supported for RAID level 4/5/6.\n");
+ exit(2);
+ }
+
if (!mode && devs_found) {
mode = MISC;
devmode = 'Q';
diff --git a/mdadm.h b/mdadm.h
index 5633663..0b27b43 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -347,6 +347,7 @@ enum special_options {
Nodes,
ClusterName,
ClusterConfirm,
+ WriteJournal,
};
enum prefix_standard {
@@ -434,6 +435,7 @@ struct context {
struct shape {
int raiddisks;
int sparedisks;
+ int journaldisks;
int level;
int layout;
char *layout_str;
diff --git a/super1.c b/super1.c
index 6905b6d..85e3b28 100644
--- a/super1.c
+++ b/super1.c
@@ -68,7 +68,10 @@ struct mdp_superblock_1 {
__u64 data_offset; /* sector start of data, often 0 */
__u64 data_size; /* sectors in this device that can be used for data */
__u64 super_offset; /* sector start of this superblock */
- __u64 recovery_offset;/* sectors before this offset (from data_offset) have been recovered */
+ union {
+ __u64 recovery_offset;/* sectors before this offset (from data_offset) have been recovered */
+ __u64 journal_tail;/* journal tail of journal device (from data_offset) */
+ };
__u32 dev_number; /* permanent identifier of this device - not role in raid */
__u32 cnt_corrected_read; /* number of read errors that were corrected by re-writing */
__u8 device_uuid[16]; /* user-space setable, ignored by kernel */
@@ -1447,6 +1450,8 @@ static int add_to_super1(struct supertype *st, mdu_disk_info_t *dk,
if ((dk->state & 6) == 6) /* active, sync */
*rp = __cpu_to_le16(dk->raid_disk);
+ else if (dk->state & (1<<MD_DISK_JOURNAL))
+ *rp = __cpu_to_le16(MD_DISK_ROLE_JOURNAL);
else if ((dk->state & ~2) == 0) /* active or idle -> spare */
*rp = MD_DISK_ROLE_SPARE;
else
@@ -1566,6 +1571,57 @@ static unsigned long choose_bm_space(unsigned long devsize)
static void free_super1(struct supertype *st);
+#define META_BLOCK_SIZE 4096
+unsigned long crc32(
+ unsigned long crc,
+ const unsigned char *buf,
+ unsigned len);
+
+static int write_empty_r5l_meta_block(struct supertype *st, int fd)
+{
+ struct r5l_meta_block *mb;
+ struct mdp_superblock_1 *sb = st->sb;
+ struct align_fd afd;
+ __u32 crc;
+
+ init_afd(&afd, fd);
+
+ if (posix_memalign((void**)&mb, 4096, META_BLOCK_SIZE) != 0) {
+ pr_err("Could not allocate memory for the meta block.\n");
+ return 1;
+ }
+
+ memset(mb, 0, META_BLOCK_SIZE);
+
+ mb->magic = __cpu_to_le32(R5LOG_MAGIC);
+ mb->version = R5LOG_VERSION;
+ mb->meta_size = __cpu_to_le32(sizeof(struct r5l_meta_block));
+ mb->seq = __cpu_to_le64(random32());
+ mb->position = __cpu_to_le64(0);
+
+ crc = crc32(0xffffffff, sb->set_uuid, sizeof(sb->set_uuid));
+ crc = crc32(crc, (void *)mb, META_BLOCK_SIZE);
+ mb->checksum = __cpu_to_le32(crc);
+
+ if (lseek64(fd, (sb->data_offset) * 512, 0) < 0LL) {
+ pr_err("cannot seek to offset of the meta block\n");
+ goto fail_to_write;
+ }
+
+ if (awrite(&afd, mb, META_BLOCK_SIZE) != META_BLOCK_SIZE) {
+ pr_err("failed to write the meta block\n");
+ goto fail_to_write;
+ }
+ fsync(fd);
+
+ free(mb);
+ return 0;
+
+fail_to_write:
+ free(mb);
+ return 1;
+}
+
#ifndef MDASSEMBLE
static int write_init_super1(struct supertype *st)
{
@@ -1580,6 +1636,11 @@ static int write_init_super1(struct supertype *st)
unsigned long long data_offset;
for (di = st->info; di; di = di->next) {
+ if (di->disk.state & (1 << MD_DISK_JOURNAL))
+ sb->feature_map |= __cpu_to_le32(MD_FEATURE_JOURNAL);
+ }
+
+ for (di = st->info; di; di = di->next) {
if (di->disk.state & (1 << MD_DISK_FAULTY))
continue;
if (di->fd < 0)
@@ -1718,6 +1779,13 @@ static int write_init_super1(struct supertype *st)
sb->sb_csum = calc_sb_1_csum(sb);
rv = store_super1(st, di->fd);
+
+ if (rv == 0 && (di->disk.state & (1 << MD_DISK_JOURNAL))) {
+ rv = write_empty_r5l_meta_block(st, di->fd);
+ if (rv)
+ goto error_out;
+ }
+
if (rv == 0 && (__le32_to_cpu(sb->feature_map) & 1))
rv = st->ss->write_bitmap(st, di->fd, NoUpdate);
close(di->fd);
--
2.4.6
* [PATCH v2 4/6] Assemble array with write journal
From: Song Liu @ 2015-10-09 5:51 UTC (permalink / raw)
To: linux-raid; +Cc: neilb, shli, hch, dan.j.williams, kernel-team, Song Liu
Example output:
./mdadm --assemble /dev/md0 /dev/sd[c-f] /dev/sdb1
mdadm: /dev/md0 has been started with 4 drives and 1 journal.
mdadm checks the superblock for journal devices. If the journal device
is missing or stale, mdadm refuses to assemble and prints a warning:
./mdadm --assemble /dev/md0 /dev/sd[c-q] /dev/sdb1
mdadm: Not safe to assemble with missing or stale journal device, consider --force.
The user can insist on starting the array (read-only) with --force:
./mdadm --assemble /dev/md0 /dev/sd[c-q] /dev/sdb1 --force
mdadm: Journal is missing or stale, starting array read only.
mdadm: /dev/md0 has been started with 15 drives.
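The three cases above reduce to a small decision table. The sketch below restates the check that this patch adds at the top of start_array(); the enum and the function name are hypothetical, while expect_journal, journal_clean and force correspond to the variables used in the diff.

```c
/* Hypothetical result type for the journal check done before starting. */
enum journal_verdict {
	JOURNAL_OK,        /* no journal expected, or journal present and current */
	JOURNAL_READ_ONLY, /* journal missing or stale, --force given */
	JOURNAL_REFUSE     /* journal missing or stale, no --force */
};

static enum journal_verdict check_journal(int expect_journal,
					  int journal_clean, int force)
{
	if (!expect_journal || journal_clean)
		return JOURNAL_OK;
	return force ? JOURNAL_READ_ONLY : JOURNAL_REFUSE;
}
```

JOURNAL_REFUSE corresponds to the "Not safe to assemble" error and a return value of 1, and JOURNAL_READ_ONLY to setting c->readonly before the array is started.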
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
---
Assemble.c | 57 ++++++++++++++++++++++++++++++++++++++++++++++-----------
mdadm.h | 3 +++
super1.c | 37 ++++++++++++++++++++++++++++++++-----
3 files changed, 81 insertions(+), 16 deletions(-)
diff --git a/Assemble.c b/Assemble.c
index d9e9001..0661e8d 100644
--- a/Assemble.c
+++ b/Assemble.c
@@ -735,7 +735,7 @@ static int load_devices(struct devs *devices, char *devmap,
i = devcnt;
else
i = devices[devcnt].i.disk.raid_disk;
- if (i+1 == 0) {
+ if (i+1 == 0 || i == MD_DISK_ROLE_JOURNAL) {
if (nextspare < content->array.raid_disks*2)
nextspare = content->array.raid_disks*2;
i = nextspare++;
@@ -913,7 +913,6 @@ static int force_array(struct mdinfo *content,
avail[chosen_drive] = 1;
okcnt++;
tst->ss->free_super(tst);
-
/* If there are any other drives of the same vintage,
* add them in as well. We can't lose and we might gain
*/
@@ -944,17 +943,29 @@ static int start_array(int mdfd,
unsigned int okcnt,
unsigned int sparecnt,
unsigned int rebuilding_cnt,
+ unsigned int journalcnt,
struct context *c,
int clean, char *avail,
int start_partial_ok,
int err_ok,
- int was_forced
+ int was_forced,
+ int expect_journal,
+ int journal_clean
)
{
int rv;
int i;
unsigned int req_cnt;
+ if (expect_journal && (journal_clean == 0)) {
+ if (!c->force) {
+ pr_err("Not safe to assemble with missing or stale journal device, consider --force.\n");
+ return 1;
+ }
+ pr_err("Journal is missing or stale, starting array read only.\n");
+ c->readonly = 1;
+ }
+
rv = set_array_info(mdfd, st, content);
if (rv && !err_ok) {
pr_err("failed to set array info for %s: %s\n",
@@ -1032,7 +1043,8 @@ static int start_array(int mdfd,
if (content->array.level == LEVEL_CONTAINER) {
if (c->verbose >= 0) {
pr_err("Container %s has been assembled with %d drive%s",
- mddev, okcnt+sparecnt, okcnt+sparecnt==1?"":"s");
+ mddev, okcnt+sparecnt+journalcnt,
+ okcnt+sparecnt+journalcnt==1?"":"s");
if (okcnt < (unsigned)content->array.raid_disks)
fprintf(stderr, " (out of %d)",
content->array.raid_disks);
@@ -1118,6 +1130,8 @@ static int start_array(int mdfd,
fprintf(stderr, "%s %d rebuilding", sparecnt?",":" and", rebuilding_cnt);
if (sparecnt)
fprintf(stderr, " and %d spare%s", sparecnt, sparecnt==1?"":"s");
+ if (journal_clean)
+ fprintf(stderr, " and %d journal", journalcnt);
fprintf(stderr, ".\n");
}
if (content->reshape_active &&
@@ -1289,10 +1303,12 @@ int Assemble(struct supertype *st, char *mddev,
int *best = NULL; /* indexed by raid_disk */
int bestcnt = 0;
int devcnt;
- unsigned int okcnt, sparecnt, rebuilding_cnt, replcnt;
+ unsigned int okcnt, sparecnt, rebuilding_cnt, replcnt, journalcnt;
int i;
int was_forced = 0;
int most_recent = 0;
+ int expect_journal = 0;
+ int journal_clean = 0;
int chosen_drive;
int change = 0;
int inargv = 0;
@@ -1355,6 +1371,14 @@ try_again:
if (!st || !st->sb || !content)
return 2;
+ if (st->ss->require_journal) {
+ expect_journal = st->ss->require_journal(st);
+ if (expect_journal == 2) {
+ pr_err("BUG: Superblock not loaded in Assemble.c:Assemble\n");
+ return 1;
+ }
+ }
+
/* We have a full set of devices - we now need to find the
* array device.
* However there is a risk that we are racing with "mdadm -I"
@@ -1530,6 +1554,7 @@ try_again:
okcnt = 0;
replcnt = 0;
sparecnt=0;
+ journalcnt=0;
rebuilding_cnt=0;
for (i=0; i< bestcnt; i++) {
int j = best[i];
@@ -1540,8 +1565,13 @@ try_again:
/* note: we ignore error flags in multipath arrays
* as they don't make sense
*/
- if (content->array.level != LEVEL_MULTIPATH)
- if (!(devices[j].i.disk.state & (1<<MD_DISK_ACTIVE))) {
+ if (content->array.level != LEVEL_MULTIPATH) {
+ if (devices[j].i.disk.state & (1<<MD_DISK_JOURNAL)) {
+ if (expect_journal)
+ journalcnt++;
+ else /* unexpected journal, mark as faulty */
+ devices[j].i.disk.state |= (1<<MD_DISK_FAULTY);
+ } else if (!(devices[j].i.disk.state & (1<<MD_DISK_ACTIVE))) {
if (!(devices[j].i.disk.state
& (1<<MD_DISK_FAULTY))) {
devices[j].uptodate = 1;
@@ -1549,6 +1579,7 @@ try_again:
}
continue;
}
+ }
/* If this device thinks that 'most_recent' has failed, then
* we must reject this device.
*/
@@ -1572,6 +1603,8 @@ try_again:
devices[most_recent].i.events
) {
devices[j].uptodate = 1;
+ if (devices[j].i.disk.state & (1<<MD_DISK_JOURNAL))
+ journal_clean = 1;
if (i < content->array.raid_disks * 2) {
if (devices[j].i.recovery_start == MaxSector ||
(content->reshape_active &&
@@ -1583,7 +1616,7 @@ try_again:
replcnt++;
} else
rebuilding_cnt++;
- } else
+ } else if (devices[j].i.disk.raid_disk != MD_DISK_ROLE_JOURNAL)
sparecnt++;
}
}
@@ -1647,7 +1680,9 @@ try_again:
int j = best[i];
unsigned int desired_state;
- if (i >= content->array.raid_disks * 2)
+ if (devices[j].i.disk.raid_disk == MD_DISK_ROLE_JOURNAL)
+ desired_state = (1<<MD_DISK_JOURNAL);
+ else if (i >= content->array.raid_disks * 2)
desired_state = 0;
else if (i & 1)
desired_state = (1<<MD_DISK_ACTIVE) | (1<<MD_DISK_REPLACEMENT);
@@ -1794,11 +1829,11 @@ try_again:
rv = start_array(mdfd, mddev, content,
st, ident, best, bestcnt,
chosen_drive, devices, okcnt, sparecnt,
- rebuilding_cnt,
+ rebuilding_cnt, journalcnt,
c,
clean, avail, start_partial_ok,
pre_exist != NULL,
- was_forced);
+ was_forced, expect_journal, journal_clean);
if (rv == 1 && !pre_exist)
ioctl(mdfd, STOP_ARRAY, NULL);
free(devices);
diff --git a/mdadm.h b/mdadm.h
index 0b27b43..b1028be 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -970,6 +970,9 @@ extern struct superswitch {
/* validate container after assemble */
int (*validate_container)(struct mdinfo *info);
+ /* whether the array requires a journal device */
+ int (*require_journal)(struct supertype *st);
+
int swapuuid; /* true if uuid is bigending rather than hostendian */
int external;
const char *name; /* canonical metadata name */
diff --git a/super1.c b/super1.c
index 85e3b28..47acdec 100644
--- a/super1.c
+++ b/super1.c
@@ -140,6 +140,34 @@ struct misc_dev_info {
|MD_FEATURE_BITMAP_VERSIONED \
|MD_FEATURE_JOURNAL \
)
+/* return value:
+ * 0, journal not required
+ * 1, journal required
+ * 2, no superblock loaded (st->sb == NULL)
+ */
+static int require_journal1(struct supertype *st)
+{
+ struct mdp_superblock_1 *sb = st->sb;
+
+ if (!sb)
+ return 2; /* no sb loaded */
+ if (sb->feature_map & __cpu_to_le32(MD_FEATURE_JOURNAL))
+ return 1;
+ return 0;
+}
+
+static int role_from_sb(struct mdp_superblock_1 *sb)
+{
+ unsigned int d;
+ int role;
+
+ d = __le32_to_cpu(sb->dev_number);
+ if (d < __le32_to_cpu(sb->max_dev))
+ role = __le16_to_cpu(sb->dev_roles[d]);
+ else
+ role = MD_DISK_ROLE_SPARE;
+ return role;
+}
/* return how many bytes are needed for bitmap, for cluster-md each node
* should have it's own bitmap */
@@ -482,11 +510,7 @@ static void examine_super1(struct supertype *st, char *homehost)
printf(")\n");
#endif
printf(" Device Role : ");
- d = __le32_to_cpu(sb->dev_number);
- if (d < __le32_to_cpu(sb->max_dev))
- role = __le16_to_cpu(sb->dev_roles[d]);
- else
- role = MD_DISK_ROLE_SPARE;
+ role = role_from_sb(sb);
if (role >= MD_DISK_ROLE_FAULTY)
printf("spare\n");
else if (role == MD_DISK_ROLE_JOURNAL)
@@ -1126,6 +1150,8 @@ static int update_super1(struct supertype *st, struct mdinfo *info,
int want;
if (info->disk.state & (1<<MD_DISK_ACTIVE))
want = info->disk.raid_disk;
+ else if (info->disk.state & (1<<MD_DISK_JOURNAL))
+ want = MD_DISK_ROLE_JOURNAL;
else
want = MD_DISK_ROLE_SPARE;
if (sb->dev_roles[d] != __cpu_to_le16(want)) {
@@ -2560,6 +2586,7 @@ struct superswitch super1 = {
.locate_bitmap = locate_bitmap1,
.write_bitmap = write_bitmap1,
.free_super = free_super1,
+ .require_journal = require_journal1,
#if __BYTE_ORDER == BIG_ENDIAN
.swapuuid = 0,
#else
--
2.4.6
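As an illustration of the role handling in this patch, here is a minimal standalone sketch, not mdadm code. The role values are the ones quoted in the cover letter (0xFFFF spare, 0xFFFE failed, 0xFFFD journal). The names sketch_sb, sketch_role_name, sketch_require_journal and SKETCH_FEATURE_JOURNAL are hypothetical, the feature bit is a placeholder rather than the real MD_FEATURE_JOURNAL value from md_p.h, and fields are treated as host-endian, whereas the real superblock fields are little-endian and go through the __le*_to_cpu helpers.

```c
#include <stddef.h>
#include <stdint.h>

/* Role values as described in the cover letter of this series. */
#define MD_DISK_ROLE_SPARE   0xffff
#define MD_DISK_ROLE_FAULTY  0xfffe
#define MD_DISK_ROLE_JOURNAL 0xfffd

/* Placeholder bit for illustration only (assumption); the real
 * MD_FEATURE_JOURNAL is defined in md_p.h. */
#define SKETCH_FEATURE_JOURNAL (1u << 9)

/* Hypothetical stand-in for the v1.x superblock; only the field
 * this sketch needs, in host byte order. */
struct sketch_sb {
	uint32_t feature_map;
};

/* Classify a dev_roles[] entry in the same order as the patched
 * examine_super1(): faulty and spare first, then journal, and
 * everything else is an active slot number. */
static const char *sketch_role_name(unsigned int role)
{
	if (role >= MD_DISK_ROLE_FAULTY)
		return "spare";
	if (role == MD_DISK_ROLE_JOURNAL)
		return "journal";
	return "active";
}

/* Same return convention as require_journal1():
 * 0 = journal not required, 1 = required, 2 = no superblock.
 * The NULL check comes before any dereference. */
static int sketch_require_journal(const struct sketch_sb *sb)
{
	if (!sb)
		return 2;
	return (sb->feature_map & SKETCH_FEATURE_JOURNAL) ? 1 : 0;
}
```

Without the journal branch, 0xFFFD falls through to the "active" case, which matches the "Active device 65533" output that the cover letter reports for older mdadm.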
* [PATCH v2 5/6] Check write journal in incremental
2015-10-09 5:51 [PATCH v2 0/6] mdadm support for journal device of RAID-4/5/6 Song Liu
` (3 preceding siblings ...)
2015-10-09 5:51 ` [PATCH v2 4/6] Assemble array with write journal Song Liu
@ 2015-10-09 5:51 ` Song Liu
2015-10-09 5:51 ` [PATCH v2 6/6] Add help message and man entry for --write-journal Song Liu
[not found] ` <C709E4D363AAB64590BFAC54D4C478AA0104B94E0A@PRN-MBX02-4.TheFacebook.com>
6 siblings, 0 replies; 8+ messages in thread
From: Song Liu @ 2015-10-09 5:51 UTC (permalink / raw)
To: linux-raid; +Cc: neilb, shli, hch, dan.j.williams, kernel-team, Song Liu
If the journal device is missing, do not start the array, and show:
./mdadm -I /dev/sdf
mdadm: journal device is missing, not safe to start yet.
The array will be started when the journal device is attached with -I
./mdadm -I /dev/sdb1
mdadm: /dev/sdb1 attached to /dev/md/0_0, which has been started.
To force start without journal device:
./mdadm -I /dev/sdf --run
mdadm: Trying to run with missing journal device
mdadm: /dev/sdf attached to /dev/md/0_0, which has been started.
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
---
Incremental.c | 31 +++++++++++++++++++++++++++----
1 file changed, 27 insertions(+), 4 deletions(-)
diff --git a/Incremental.c b/Incremental.c
index 43fddfd..5b2974c 100644
--- a/Incremental.c
+++ b/Incremental.c
@@ -35,7 +35,7 @@
static int count_active(struct supertype *st, struct mdinfo *sra,
int mdfd, char **availp,
- struct mdinfo *info);
+ struct mdinfo *info, int *journal_device_missing);
static void find_reject(int mdfd, struct supertype *st, struct mdinfo *sra,
int number, __u64 events, int verbose,
char *array_name);
@@ -104,6 +104,7 @@ int Incremental(struct mddev_dev *devlist, struct context *c,
struct map_ent target_array;
int have_target;
char *devname = devlist->devname;
+ int journal_device_missing = 0;
struct createinfo *ci = conf_get_create_info();
@@ -519,7 +520,7 @@ int Incremental(struct mddev_dev *devlist, struct context *c,
sysfs_free(sra);
sra = sysfs_read(mdfd, NULL, (GET_DEVS | GET_STATE |
GET_OFFSET | GET_SIZE));
- active_disks = count_active(st, sra, mdfd, &avail, &info);
+ active_disks = count_active(st, sra, mdfd, &avail, &info, &journal_device_missing);
if (enough(info.array.level, info.array.raid_disks,
info.array.layout, info.array.state & 1,
avail) == 0) {
@@ -549,10 +550,12 @@ int Incremental(struct mddev_dev *devlist, struct context *c,
}
map_unlock(&map);
- if (c->runstop > 0 || active_disks >= info.array.working_disks) {
+ if (c->runstop > 0 || (!journal_device_missing && active_disks >= info.array.working_disks)) {
struct mdinfo *dsk;
/* Let's try to start it */
+ if (journal_device_missing)
+ pr_err("Trying to run with missing journal device\n");
if (info.reshape_active && !(info.reshape_active & RESHAPE_NO_BACKUP)) {
pr_err("%s: This array is being reshaped and cannot be started\n",
chosen_name);
@@ -619,6 +622,8 @@ int Incremental(struct mddev_dev *devlist, struct context *c,
} else {
if (c->export) {
printf("MD_STARTED=unsafe\n");
+ } else if (journal_device_missing) {
+ pr_err("Journal device is missing, not safe to start yet.\n");
} else if (c->verbose >= 0)
pr_err("%s attached to %s, not enough to start safely.\n",
devname, chosen_name);
@@ -685,7 +690,8 @@ static void find_reject(int mdfd, struct supertype *st, struct mdinfo *sra,
static int count_active(struct supertype *st, struct mdinfo *sra,
int mdfd, char **availp,
- struct mdinfo *bestinfo)
+ struct mdinfo *bestinfo,
+ int *journal_device_missing)
{
/* count how many devices in sra think they are active */
struct mdinfo *d;
@@ -699,6 +705,8 @@ static int count_active(struct supertype *st, struct mdinfo *sra,
int devnum;
int b, i;
int raid_disks = 0;
+ int require_journal_dev = 0;
+ int has_journal_dev = 0;
if (!sra)
return 0;
@@ -719,8 +727,19 @@ static int count_active(struct supertype *st, struct mdinfo *sra,
close(dfd);
if (ok != 0)
continue;
+
+ if (st->ss->require_journal) {
+ require_journal_dev = st->ss->require_journal(st);
+ if (require_journal_dev == 2) {
+ pr_err("BUG: Superblock not loaded in Incremental.c:count_active\n");
+ return 0;
+ }
+ }
+
info.array.raid_disks = raid_disks;
st->ss->getinfo_super(st, &info, devmap + raid_disks * devnum);
+ if (info.disk.raid_disk == MD_DISK_ROLE_JOURNAL)
+ has_journal_dev = 1;
if (!avail) {
raid_disks = info.array.raid_disks;
avail = xcalloc(raid_disks, 1);
@@ -770,6 +789,10 @@ static int count_active(struct supertype *st, struct mdinfo *sra,
replcnt++;
st->ss->free_super(st);
}
+
+ if (require_journal_dev && !has_journal_dev)
+ *journal_device_missing = 1;
+
if (!avail)
return 0;
/* We need to reject any device that thinks the best device is
--
2.4.6
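The start condition changed in Incremental() above can be read as a small predicate. The following is a hedged standalone sketch, not part of the patch: sketch_should_start is a hypothetical name, and the parameters stand in for c->runstop, journal_device_missing, active_disks and info.array.working_disks.

```c
#include <stdbool.h>

/* Hypothetical helper mirroring the patched condition:
 *   c->runstop > 0 ||
 *   (!journal_device_missing && active_disks >= working_disks)
 */
static bool sketch_should_start(int runstop, bool journal_missing,
				int active_disks, int working_disks)
{
	/* A positive runstop (the --run case in the changelog) forces a
	 * start attempt even when the journal device is missing. */
	if (runstop > 0)
		return true;

	/* Otherwise every expected device must be present, and a
	 * required journal must not be missing. */
	return !journal_missing && active_disks >= working_disks;
}
```

With this reading, a complete set of data disks is no longer enough on its own: the array waits until the journal device is attached with -I, unless --run is given.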
* [PATCH v2 6/6] Add help message and man entry for --write-journal
2015-10-09 5:51 [PATCH v2 0/6] mdadm support for journal device of RAID-4/5/6 Song Liu
` (4 preceding siblings ...)
2015-10-09 5:51 ` [PATCH v2 5/6] Check write journal in incremental Song Liu
@ 2015-10-09 5:51 ` Song Liu
[not found] ` <C709E4D363AAB64590BFAC54D4C478AA0104B94E0A@PRN-MBX02-4.TheFacebook.com>
6 siblings, 0 replies; 8+ messages in thread
From: Song Liu @ 2015-10-09 5:51 UTC (permalink / raw)
To: linux-raid; +Cc: neilb, shli, hch, dan.j.williams, kernel-team, Song Liu
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
---
ReadMe.c | 1 +
mdadm.8.in | 6 ++++++
2 files changed, 7 insertions(+)
diff --git a/ReadMe.c b/ReadMe.c
index 10921e3..fb5a671 100644
--- a/ReadMe.c
+++ b/ReadMe.c
@@ -376,6 +376,7 @@ char Help_create[] =
" --name= -N : Textual name for array - max 32 characters\n"
" --bitmap-chunk= : bitmap chunksize in Kilobytes.\n"
" --delay= -d : bitmap update delay in seconds.\n"
+" --write-journal= : Specify journal device for RAID-4/5/6 array\n"
"\n"
;
diff --git a/mdadm.8.in b/mdadm.8.in
index bf3e131..2844039 100644
--- a/mdadm.8.in
+++ b/mdadm.8.in
@@ -990,6 +990,12 @@ Only works when the array is for clustered environment. It specifies
the maximum number of nodes in the cluster that will use this device
simultaneously. If not specified, this defaults to 4.
+.TP
+.BR \-\-write-journal
+Specify journal device for the RAID-4/5/6 array. The journal device
+should be an SSD with reasonable lifetime.
+
+
.SH For assemble:
.TP
--
2.4.6
* RE: [PATCH v2 0/6] mdadm support for journal device of RAID-4/5/6
[not found] ` <C709E4D363AAB64590BFAC54D4C478AA0104B94E0A@PRN-MBX02-4.TheFacebook.com>
@ 2015-10-19 2:11 ` Neil Brown
0 siblings, 0 replies; 8+ messages in thread
From: Neil Brown @ 2015-10-19 2:11 UTC (permalink / raw)
To: Song Liu; +Cc: Shaohua Li, linux-raid
Song Liu <songliubraving@fb.com> writes:
> Hi Neil,
>
> Could you please share some insights about kernel and mdadm patches for
> journal device in RAID-4/5/6? What shall we do next to move this approach
> ahead?
I hadn't been looking at the mdadm patches until I was fairly
comfortable with the kernel code. I think we have reached that state
now and I have just looked at your mdadm patches.
These seem quite sensible and thorough - thanks. I have applied them to
my 'master' branch.
Would you be able to write a few test scripts to go in the 'tests'
directory?
Some of your changelog comments demonstrated how some commands would work
and others would give useful error messages. If there were test scripts
which confirmed the code continues to do that (and maybe more) that
would be great.
Also a section of the md.4 man page (similar to "BITMAP WRITE-INTENT
LOGGING" and "BAD BLOCK LIST") which gave a general outline of the
purpose, value, and possible costs, of using a journal would be really
helpful.
Thanks,
NeilBrown