From: NeilBrown <neilb@suse.de>
To: Piergiorgio Sartor <piergiorgio.sartor@nexgo.de>
Cc: linux-raid@vger.kernel.org
Subject: Re: [PATCH] Re: User space RAID-6 access
Date: Tue, 8 Feb 2011 09:49:52 +1100 [thread overview]
Message-ID: <20110208094952.49745c7b@notabene.brown> (raw)
In-Reply-To: <20110207222459.GA25471@lazy.lzy>
On Mon, 7 Feb 2011 23:24:59 +0100 Piergiorgio Sartor
<piergiorgio.sartor@nexgo.de> wrote:
> > test_stripe assumes that the data starts at the start of each device.
> > AS you are using 1.2 metadata (the default), data starts about 1M in to
> > the device (I think - you can check with --examine)
> >
> > You could fix test_stripe to put the right value in the 'offsets' array,
> > or you could create the array with 1.0 or 0.90 metadata.
>
> Hi Neil,
>
> thanks for the info, maybe this should be a second patch.
>
> In the meantime, please find attached a patch to restripe.c
> of mdadm 3.2 (latest, I hope).
>
> This should add the functionality to detect, in RAID-6,
> which of the disks potentially has problems, in case of
> parity errors.
>
> Some checks take place in order to avoid false positives,
> I hope these are correct and enough.
>
> I'm not 100% happy of the interface (too much redundancy),
> but for the time being it could be OK.
>
> Of course, any improvement is welcome.
>
> Please consider to include these changes to the next mdadm
> whatever release.
>
Thanks a lot!
I have applied some patch - with some formatting changes to make it consistent
with the rest of the code.
I don't really have time to look more deeply at it at the moment.
Maybe someone else will?...
Thanks,
NeilBrown
> bye,
>
> --- restripe.c.org 2011-01-13 05:52:15.000000000 +0100
> +++ restripe.c 2011-02-07 22:51:01.539471472 +0100
> @@ -285,10 +285,13 @@
> uint8_t raid6_gfexp[256];
> uint8_t raid6_gfinv[256];
> uint8_t raid6_gfexi[256];
> +uint8_t raid6_gflog[256];
> +uint8_t raid6_gfilog[256];
> void make_tables(void)
> {
> int i, j;
> uint8_t v;
> + uint32_t b, log;
>
> /* Compute multiplication table */
> for (i = 0; i < 256; i++)
> @@ -312,6 +315,19 @@
> for (i = 0; i < 256; i ++)
> raid6_gfexi[i] = raid6_gfinv[raid6_gfexp[i] ^ 1];
>
> + /* Compute log and inverse log */
> + /* Modified code from: http://web.eecs.utk.edu/~plank/plank/papers/CS-96-332.html */
> + b = 1;
> + raid6_gflog[0] = 0;
> + raid6_gfilog[255] = 0;
> +
> + for (log = 0; log < 255; log++) {
> + raid6_gflog[b] = (uint8_t) log;
> + raid6_gfilog[log] = (uint8_t) b;
> + b = b << 1;
> + if (b & 256) b = b ^ 0435;
> + }
> +
> tables_ready = 1;
> }
>
> @@ -387,6 +403,65 @@
> }
> }
>
> +/* Try to find out if a specific disk has a problem */
> +int raid6_check_disks(int data_disks, int start, int chunk_size, int level, int layout, int diskP, int diskQ, char *p, char *q, char **stripes)
> +{
> + int i;
> + int data_id, diskD;
> + uint8_t Px, Qx;
> + int curr_broken_disk = -1;
> + int prev_broken_disk = -1;
> + int broken_status = 0;
> +
> + for(i = 0; i < chunk_size; i++) {
> + Px = (uint8_t)stripes[diskP][i] ^ (uint8_t)p[i];
> + Qx = (uint8_t)stripes[diskQ][i] ^ (uint8_t)q[i];
> +
> + if((Px != 0) && (Qx == 0)) {
> + curr_broken_disk = diskP;
> + }
> +
> + if((Px == 0) && (Qx != 0)) {
> + curr_broken_disk = diskQ;
> + }
> +
> + if((Px != 0) && (Qx != 0)) {
> + data_id = (raid6_gflog[Qx] - raid6_gflog[Px]) & 0xFF;
> + diskD = geo_map(data_id, start/chunk_size, data_disks + 2, level, layout);
> + curr_broken_disk = diskD;
> + }
> +
> + if((Px == 0) && (Qx == 0)) {
> + curr_broken_disk = curr_broken_disk;
> + }
> +
> + switch(broken_status) {
> + case 0:
> + if(curr_broken_disk != -1) {
> + prev_broken_disk = curr_broken_disk;
> + broken_status = 1;
> + }
> + break;
> +
> + case 1:
> + if(curr_broken_disk != prev_broken_disk) {
> + broken_status = 2;
> + }
> + if(curr_broken_disk >= data_disks + 2) {
> + broken_status = 2;
> + }
> + break;
> +
> + case 2:
> + default:
> + curr_broken_disk = prev_broken_disk = -2;
> + break;
> + }
> + }
> +
> + return curr_broken_disk;
> +}
> +
> /* Save data:
> * We are given:
> * A list of 'fds' of the active disks. Some may be absent.
> @@ -673,7 +748,12 @@
> char *q = malloc(chunk_size);
>
> int i;
> + int diskP, diskQ;
> int data_disks = raid_disks - (level == 5 ? 1: 2);
> +
> + if (!tables_ready)
> + make_tables();
> +
> for ( i = 0 ; i < raid_disks ; i++)
> stripes[i] = stripe_buf + i * chunk_size;
>
> @@ -693,18 +773,25 @@
> switch(level) {
> case 6:
> qsyndrome(p, q, (uint8_t**)blocks, data_disks, chunk_size);
> - disk = geo_map(-1, start/chunk_size, raid_disks,
> + diskP = geo_map(-1, start/chunk_size, raid_disks,
> level, layout);
> - if (memcmp(p, stripes[disk], chunk_size) != 0) {
> - printf("P(%d) wrong at %llu\n", disk,
> + if (memcmp(p, stripes[diskP], chunk_size) != 0) {
> + printf("P(%d) wrong at %llu\n", diskP,
> start / chunk_size);
> }
> - disk = geo_map(-2, start/chunk_size, raid_disks,
> + diskQ = geo_map(-2, start/chunk_size, raid_disks,
> level, layout);
> - if (memcmp(q, stripes[disk], chunk_size) != 0) {
> - printf("Q(%d) wrong at %llu\n", disk,
> + if (memcmp(q, stripes[diskQ], chunk_size) != 0) {
> + printf("Q(%d) wrong at %llu\n", diskQ,
> start / chunk_size);
> }
> + disk = raid6_check_disks(data_disks, start, chunk_size, level, layout, diskP, diskQ, p, q, stripes);
> + if(disk >= 0) {
> + printf("Possible failed disk: %d\n", disk);
> + }
> + if(disk == -2) {
> + printf("Failure detected, but disk unknown\n");
> + }
> break;
> }
> length -= chunk_size;
>
next prev parent reply other threads:[~2011-02-07 22:49 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-01-31 20:20 User space RAID-6 access Piergiorgio Sartor
2011-01-31 20:52 ` NeilBrown
2011-02-01 19:21 ` Piergiorgio Sartor
2011-02-01 20:14 ` John Robinson
2011-02-01 20:18 ` NeilBrown
2011-02-01 21:00 ` Piergiorgio Sartor
2011-02-05 17:33 ` Piergiorgio Sartor
2011-02-05 20:58 ` NeilBrown
2011-02-07 22:24 ` [PATCH] " Piergiorgio Sartor
2011-02-07 22:49 ` NeilBrown [this message]
2011-02-09 18:47 ` Piergiorgio Sartor
2011-02-17 6:23 ` NeilBrown
2011-02-17 20:01 ` Piergiorgio Sartor
2011-02-18 23:02 ` Piergiorgio Sartor
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110208094952.49745c7b@notabene.brown \
--to=neilb@suse.de \
--cc=linux-raid@vger.kernel.org \
--cc=piergiorgio.sartor@nexgo.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).