* Fwd: raid1 data corruption during resync [not found] <D4FE2D75-4208-48C9-A4D0-432F092E5AE9@redhat.com> @ 2014-09-08 15:52 ` Brassow Jonathan 2014-09-09 1:08 ` NeilBrown 0 siblings, 1 reply; 17+ messages in thread From: Brassow Jonathan @ 2014-09-08 15:52 UTC (permalink / raw) To: linux-raid@vger.kernel.org Raid; +Cc: NeilBrown Begin forwarded message: > From: Brassow Jonathan <jbrassow@redhat.com> > Date: September 4, 2014 9:01:58 AM CDT > To: NeilBrown <neilb@suse.de> > Cc: Eivind Sarto <eivindsarto@gmail.com>, "linux-raid@vger.kernel.org Raid" <linux-raid@vger.kernel.org> > Subject: Re: raid1 data corruption during resync > > > On Sep 4, 2014, at 12:28 AM, NeilBrown wrote: > >> >> Neither of these explain the hang you are seeing. >> I note that the "md0-resync" thread isn't listed. I don't suppose you know >> what it was doing (stack trace)?? >> Also, had "md: md0: sync done" appeared in syslog yet? > > The sync has not yet completed (no message). I'm not sure why the resync thread didn't automatically report, but I've grabbed the entire trace from the machine ('echo t > /proc/sysrq-trigger') and it appears there. The traces are attached. (Would you rather have something so large posted in-line?) I didn't see this message come through yet. I am resending it with only the trace you requested from the mdX_resync thread. If you need the entire list of traces, I can try resending that. brassow Sep 4 08:52:00 bp-01 kernel: mdX_resync D 0000000000000008 0 12374 2 0x00000080 Sep 4 08:52:00 bp-01 kernel: ffff880207a5bb38 0000000000000046 0000000000000296 ffff88021727efc0 Sep 4 08:52:00 bp-01 kernel: ffff880207a58010 0000000000012bc0 0000000000012bc0 ffff880214de0f40 Sep 4 08:52:00 bp-01 kernel: ffff880207a5bb60 ffff88040171e178 ffff88040171e100 ffff880207a5bb58 Sep 4 08:52:00 bp-01 kernel: Call Trace: Sep 4 08:52:00 bp-01 kernel: [<ffffffff81580549>] schedule+0x29/0x70 Sep 4 08:52:00 bp-01 kernel: [<ffffffffa038a122>] raise_barrier+0xe2/0x160 [raid1] Sep 4 08:52:00 bp-01 kernel: [<ffffffff8108eb00>] ? bit_waitqueue+0xe0/0xe0 Sep 4 08:52:00 bp-01 kernel: [<ffffffffa038b701>] sync_request+0x161/0x9e0 [raid1] Sep 4 08:52:00 bp-01 kernel: [<ffffffff8108ee13>] ? __wake_up+0x53/0x70 Sep 4 08:52:00 bp-01 kernel: [<ffffffff81460009>] md_do_sync+0x849/0xd40 Sep 4 08:52:00 bp-01 kernel: [<ffffffff8108847f>] ? put_prev_entity+0x2f/0x400 Sep 4 08:52:00 bp-01 kernel: [<ffffffff81460816>] md_thread+0x116/0x150 Sep 4 08:52:00 bp-01 kernel: [<ffffffff8158006e>] ? __schedule+0x34e/0x6e0 Sep 4 08:52:00 bp-01 kernel: [<ffffffff81460700>] ? md_rdev_init+0x110/0x110 Sep 4 08:52:00 bp-01 kernel: [<ffffffff8107083e>] kthread+0xce/0xf0 Sep 4 08:52:00 bp-01 kernel: [<ffffffff81070770>] ? kthread_freezable_should_stop+0x70/0x70 Sep 4 08:52:00 bp-01 kernel: [<ffffffff8158412c>] ret_from_fork+0x7c/0xb0 Sep 4 08:52:00 bp-01 kernel: [<ffffffff81070770>] ? kthread_freezable_should_stop+0x70/0x70 ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: raid1 data corruption during resync 2014-09-08 15:52 ` Fwd: raid1 data corruption during resync Brassow Jonathan @ 2014-09-09 1:08 ` NeilBrown 0 siblings, 0 replies; 17+ messages in thread From: NeilBrown @ 2014-09-09 1:08 UTC (permalink / raw) To: Brassow Jonathan; +Cc: linux-raid@vger.kernel.org Raid [-- Attachment #1: Type: text/plain, Size: 3950 bytes --] On Mon, 8 Sep 2014 10:52:49 -0500 Brassow Jonathan <jbrassow@redhat.com> wrote: > > > Begin forwarded message: > > > From: Brassow Jonathan <jbrassow@redhat.com> > > Date: September 4, 2014 9:01:58 AM CDT > > To: NeilBrown <neilb@suse.de> > > Cc: Eivind Sarto <eivindsarto@gmail.com>, "linux-raid@vger.kernel.org Raid" <linux-raid@vger.kernel.org> > > Subject: Re: raid1 data corruption during resync > > > > > > On Sep 4, 2014, at 12:28 AM, NeilBrown wrote: > > > >> > >> Neither of these explain the hang you are seeing. > >> I note that the "md0-resync" thread isn't listed. I don't suppose you know > >> what it was doing (stack trace)?? > >> Also, had "md: md0: sync done" appeared in syslog yet? > > > > The sync has not yet completed (no message). I'm not sure why the resync thread didn't automatically report, but I've grabbed the entire trace from the machine ('echo t > /proc/sysrq-trigger') and it appears there. The traces are attached. (Would you rather have something so large posted in-line?) > > I didn't see this message come through yet. I am resending it with only the trace you requested from the mdX_resync thread. If you need the entire list of traces, I can try resending that. > > brassow > > Sep 4 08:52:00 bp-01 kernel: mdX_resync D 0000000000000008 0 12374 2 0x00000080 > Sep 4 08:52:00 bp-01 kernel: ffff880207a5bb38 0000000000000046 0000000000000296 ffff88021727efc0 > Sep 4 08:52:00 bp-01 kernel: ffff880207a58010 0000000000012bc0 0000000000012bc0 ffff880214de0f40 > Sep 4 08:52:00 bp-01 kernel: ffff880207a5bb60 ffff88040171e178 ffff88040171e100 ffff880207a5bb58 > Sep 4 08:52:00 bp-01 kernel: Call Trace: > Sep 4 08:52:00 bp-01 kernel: [<ffffffff81580549>] schedule+0x29/0x70 > Sep 4 08:52:00 bp-01 kernel: [<ffffffffa038a122>] raise_barrier+0xe2/0x160 [raid1] > Sep 4 08:52:00 bp-01 kernel: [<ffffffff8108eb00>] ? bit_waitqueue+0xe0/0xe0 > Sep 4 08:52:00 bp-01 kernel: [<ffffffffa038b701>] sync_request+0x161/0x9e0 [raid1] > Sep 4 08:52:00 bp-01 kernel: [<ffffffff8108ee13>] ? __wake_up+0x53/0x70 > Sep 4 08:52:00 bp-01 kernel: [<ffffffff81460009>] md_do_sync+0x849/0xd40 > Sep 4 08:52:00 bp-01 kernel: [<ffffffff8108847f>] ? put_prev_entity+0x2f/0x400 > Sep 4 08:52:00 bp-01 kernel: [<ffffffff81460816>] md_thread+0x116/0x150 > Sep 4 08:52:00 bp-01 kernel: [<ffffffff8158006e>] ? __schedule+0x34e/0x6e0 > Sep 4 08:52:00 bp-01 kernel: [<ffffffff81460700>] ? md_rdev_init+0x110/0x110 > Sep 4 08:52:00 bp-01 kernel: [<ffffffff8107083e>] kthread+0xce/0xf0 > Sep 4 08:52:00 bp-01 kernel: [<ffffffff81070770>] ? kthread_freezable_should_stop+0x70/0x70 > Sep 4 08:52:00 bp-01 kernel: [<ffffffff8158412c>] ret_from_fork+0x7c/0xb0 > Sep 4 08:52:00 bp-01 kernel: [<ffffffff81070770>] ? kthread_freezable_should_stop+0x70/0x70 > I did get the original thanks, I guess it just didn't make it to the list as well. Probably there is a limit on attachment sizes which isn't entirely unreasonable. No brilliant ideas yet... That fact that there are no kernel messages: no "sync done" or "redirecting sector" or "unrecoverable I/O" is a little surprising... It appears that conf->barrier is elevated, implying that there is a resync request that is still in-flight. 
However, I cannot think where it would be. It might help if I could see the disassembly of raise_barrier() so I could confirm whether "raise_barrier+0xe2/0x160" is in the first or the second "wait_event_lock_irq" call. I assume it is in the first, and is waiting for the request that kcopyd wants to submit to complete. If it were the second, then it would be waiting for ->start_next_window to increase. That happens when allow_barrier() is called. If that were the blockage, it means that some normal write is in-flight, rather than a sync request. But I don't know where it could be either. So - still mystified. NeilBrown [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 828 bytes --] ^ permalink raw reply [flat|nested] 17+ messages in thread
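For reference, both wait_event_lock_irq() calls mentioned above live in raise_barrier(). A rough sketch of the function as it looks after commit 79ef3a8 (reconstructed for illustration; the exact upstream code may differ in detail):

    static void raise_barrier(struct r1conf *conf)
    {
            spin_lock_irq(&conf->resync_lock);

            /* First wait site: wait until no normal I/O is queued up
             * waiting on the barrier. */
            wait_event_lock_irq(conf->wait_barrier, !conf->nr_waiting,
                                conf->resync_lock);

            /* Block any new I/O from starting. */
            conf->barrier++;

            /* Second wait site: the array must not be frozen, the number
             * of in-flight resync requests must stay below RESYNC_DEPTH,
             * and the window of normal writes must have moved clear of
             * the region this resync request is about to touch. */
            wait_event_lock_irq(conf->wait_barrier,
                                !conf->array_frozen &&
                                conf->barrier < RESYNC_DEPTH &&
                                (conf->start_next_window >=
                                 conf->next_resync + RESYNC_SECTORS),
                                conf->resync_lock);

            spin_unlock_irq(&conf->resync_lock);
    }

Which of the two calls the offset 0xe2/0x160 falls in depends on the compiled layout, hence the request for the disassembly.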
* raid1 data corruption during resync @ 2014-08-29 19:29 Eivind Sarto 2014-09-02 14:10 ` Brassow Jonathan ` (2 more replies) 0 siblings, 3 replies; 17+ messages in thread From: Eivind Sarto @ 2014-08-29 19:29 UTC (permalink / raw) To: NeilBrown; +Cc: linux-raid I am seeing occasional data corruption during raid1 resync. Reviewing the raid1 code, I suspect that commit 79ef3a8aa1cb1523cc231c9a90a278333c21f761 introduced a bug. Prior to this commit raise_barrier() used to wait for conf->nr_pending to become zero. It no longer does this. It is not easy to reproduce the corruption, so I wanted to ask about the following potential fix while I am still testing it. Once I validate that the fix indeed works, I will post a proper patch. Do you have any feedback? — drivers/md/raid1.c 2014-08-22 15:19:15.000000000 -0700 +++ /tmp/raid1.c 2014-08-29 12:07:51.000000000 -0700 @@ -851,7 +851,7 @@ static void raise_barrier(struct r1conf * handling. */ wait_event_lock_irq(conf->wait_barrier, - !conf->array_frozen && + !conf->array_frozen && !conf->nr_pending && conf->barrier < RESYNC_DEPTH && (conf->start_next_window >= conf->next_resync + RESYNC_SECTORS), -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 17+ messages in thread
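For comparison, before commit 79ef3a8 the second wait in raise_barrier() looked roughly like this (a reconstruction of the old behaviour, not an exact quote of the old source); this is the nr_pending wait the report refers to:

            /* Block any new I/O from starting. */
            conf->barrier++;

            /* Old behaviour: wait for *all* pending normal I/O to drain
             * before letting a resync request proceed. */
            wait_event_lock_irq(conf->wait_barrier,
                                !conf->nr_pending &&
                                conf->barrier < RESYNC_DEPTH,
                                conf->resync_lock);

Under the old scheme a resync request and normal I/O could not be in flight at the same time; the new scheme only serialises writes that fall near the resync point.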
* Re: raid1 data corruption during resync 2014-08-29 19:29 Eivind Sarto @ 2014-09-02 14:10 ` Brassow Jonathan 2014-09-02 16:43 ` Eivind Sarto 2014-09-02 17:04 ` Eivind Sarto 2014-09-02 16:59 ` Brassow Jonathan 2014-09-02 19:24 ` Brassow Jonathan 2 siblings, 2 replies; 17+ messages in thread From: Brassow Jonathan @ 2014-09-02 14:10 UTC (permalink / raw) To: Eivind Sarto; +Cc: NeilBrown, linux-raid There is absolutely an issue with the mentioned commit. We are seeing symptoms of a different sort. Our testing group is doing device fault testing. (They are using LVM to create the RAID devices, but as you know, that uses the same underlying kernel code.) We encounter situations where the resync thread is stuck waiting for a barrier to come down, but it never does. All I/O to the RAID1 device blocks. There are a few different variations to how the problem manifests itself, but I have bisected the kernel to this commit (79ef3a8). Unfortunately, it does take quite some time (hours) to reproduce the issue with our test scripts. I will try testing this proposed patch while I try to figure out what the patch is doing and what might have gone wrong. brassow On Aug 29, 2014, at 2:29 PM, Eivind Sarto wrote: > I am seeing occasional data corruption during raid1 resync. > Reviewing the raid1 code, I suspect that commit 79ef3a8aa1cb1523cc231c9a90a278333c21f761 introduced a bug. > Prior to this commit raise_barrier() used to wait for conf->nr_pending to become zero. It no longer does this. > It is not easy to reproduce the corruption, so I wanted to ask about the following potential fix while I am still testing it. > Once I validate that the fix indeed works, I will post a proper patch. > Do you have any feedback? > > — drivers/md/raid1.c 2014-08-22 15:19:15.000000000 -0700 > +++ /tmp/raid1.c 2014-08-29 12:07:51.000000000 -0700 > @@ -851,7 +851,7 @@ static void raise_barrier(struct r1conf > * handling. > */ > wait_event_lock_irq(conf->wait_barrier, > - !conf->array_frozen && > + !conf->array_frozen && !conf->nr_pending && > conf->barrier < RESYNC_DEPTH && > (conf->start_next_window >= > conf->next_resync + RESYNC_SECTORS), > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: raid1 data corruption during resync 2014-09-02 14:10 ` Brassow Jonathan @ 2014-09-02 16:43 ` Eivind Sarto 2014-09-02 17:04 ` Eivind Sarto 1 sibling, 0 replies; 17+ messages in thread From: Eivind Sarto @ 2014-09-02 16:43 UTC (permalink / raw) To: Brassow Jonathan; +Cc: Eivind Sarto, NeilBrown, linux-raid On Sep 2, 2014, at 7:10 AM, Brassow Jonathan <jbrassow@redhat.com> wrote: > There is absolutely an issue with the mentioned commit. > > We are seeing symptoms of a different sort. Our testing group is doing device fault testing. (They are using LVM to create the RAID devices, but as you know, that uses the same underlying kernel code.) We encounter situations where the resync thread is stuck waiting for a barrier to come down, but it never does. All I/O to the RAID1 device blocks. > > There are a few different variations to how the problem manifests itself, but I have bisected the kernel to this commit (79ef3a8). > > Unfortunately, it does take quite some time (hours) to reproduce the issue with our test scripts. I will try testing this proposed patch while I try to figure out what the patch is doing and what might have gone wrong. > > brassow > > > On Aug 29, 2014, at 2:29 PM, Eivind Sarto wrote: > >> I am seeing occasional data corruption during raid1 resync. >> Reviewing the raid1 code, I suspect that commit 79ef3a8aa1cb1523cc231c9a90a278333c21f761 introduced a bug. >> Prior to this commit raise_barrier() used to wait for conf->nr_pending to become zero. It no longer does this. >> It is not easy to reproduce the corruption, so I wanted to ask about the following potential fix while I am still testing it. >> Once I validate that the fix indeed works, I will post a proper patch. >> Do you have any feedback? >> >> — drivers/md/raid1.c 2014-08-22 15:19:15.000000000 -0700 >> +++ /tmp/raid1.c 2014-08-29 12:07:51.000000000 -0700 >> @@ -851,7 +851,7 @@ static void raise_barrier(struct r1conf >> * handling. >> */ >> wait_event_lock_irq(conf->wait_barrier, >> - !conf->array_frozen && >> + !conf->array_frozen && !conf->nr_pending && >> conf->barrier < RESYNC_DEPTH && >> (conf->start_next_window >= >> conf->next_resync + RESYNC_SECTORS), >> >> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > The suggested patch is not the right thing to do. I got confused about the big comment about /* Barriers…. */ The comment suggests that conf->nr_pending is used for the implementation of barriers. This is not the case. The new barrier code no longer uses nr_pending. nr_pending is now only used when freezing the array. Sorry for the confusion. I back-ported the new barrier code to an older version of linux. I could have introduced a bug during the back-porting. I will try and upgrade my system to a newer kernel and see if I can reproduce the corruption. -eivind-- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 17+ messages in thread
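For reference, the freeze path mentioned above is freeze_array(), which is now the main consumer of nr_pending. Roughly (a sketch from memory; details may differ from the exact source):

    static void freeze_array(struct r1conf *conf, int extra)
    {
            /* Stop sync I/O and normal I/O and wait for everything to
             * quiesce: pending requests must either complete or be
             * queued for retry. */
            spin_lock_irq(&conf->resync_lock);
            conf->array_frozen = 1;
            wait_event_lock_irq_cmd(conf->wait_barrier,
                                    conf->nr_pending == conf->nr_queued + extra,
                                    conf->resync_lock,
                                    flush_pending_writes(conf));
            spin_unlock_irq(&conf->resync_lock);
    }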
* Re: raid1 data corruption during resync 2014-09-02 14:10 ` Brassow Jonathan 2014-09-02 16:43 ` Eivind Sarto @ 2014-09-02 17:04 ` Eivind Sarto 1 sibling, 0 replies; 17+ messages in thread From: Eivind Sarto @ 2014-09-02 17:04 UTC (permalink / raw) To: Brassow Jonathan; +Cc: Eivind Sarto, NeilBrown, linux-raid On Sep 2, 2014, at 7:10 AM, Brassow Jonathan <jbrassow@redhat.com> wrote: > There is absolutely an issue with the mentioned commit. > > We are seeing symptoms of a different sort. Our testing group is doing device fault testing. (They are using LVM to create the RAID devices, but as you know, that uses the same underlying kernel code.) We encounter situations where the resync thread is stuck waiting for a barrier to come down, but it never does. All I/O to the RAID1 device blocks. > > There are a few different variations to how the problem manifests itself, but I have bisected the kernel to this commit (79ef3a8). > > Unfortunately, it does take quite some time (hours) to reproduce the issue with our test scripts. I will try testing this proposed patch while I try to figure out what the patch is doing and what might have gone wrong. > > brassow > > > On Aug 29, 2014, at 2:29 PM, Eivind Sarto wrote: > >> I am seeing occasional data corruption during raid1 resync. >> Reviewing the raid1 code, I suspect that commit 79ef3a8aa1cb1523cc231c9a90a278333c21f761 introduced a bug. >> Prior to this commit raise_barrier() used to wait for conf->nr_pending to become zero. It no longer does this. >> It is not easy to reproduce the corruption, so I wanted to ask about the following potential fix while I am still testing it. >> Once I validate that the fix indeed works, I will post a proper patch. >> Do you have any feedback? >> >> — drivers/md/raid1.c 2014-08-22 15:19:15.000000000 -0700 >> +++ /tmp/raid1.c 2014-08-29 12:07:51.000000000 -0700 >> @@ -851,7 +851,7 @@ static void raise_barrier(struct r1conf >> * handling. >> */ >> wait_event_lock_irq(conf->wait_barrier, >> - !conf->array_frozen && >> + !conf->array_frozen && !conf->nr_pending && >> conf->barrier < RESYNC_DEPTH && >> (conf->start_next_window >= >> conf->next_resync + RESYNC_SECTORS), >> >> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > Yes, you are correct. I think above patch would just return the resync to the old implementation. No resync-window. Just either one of resync-IO or user-IO at any given time for the entire array. As I said, the comment in the resync code no longer matches the actual implementation. -eivind-- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: raid1 data corruption during resync 2014-08-29 19:29 Eivind Sarto 2014-09-02 14:10 ` Brassow Jonathan @ 2014-09-02 16:59 ` Brassow Jonathan 2014-09-02 19:24 ` Brassow Jonathan 2 siblings, 0 replies; 17+ messages in thread From: Brassow Jonathan @ 2014-09-02 16:59 UTC (permalink / raw) To: Eivind Sarto; +Cc: NeilBrown, linux-raid@vger.kernel.org Raid, majianpeng On Aug 29, 2014, at 2:29 PM, Eivind Sarto wrote: > I am seeing occasional data corruption during raid1 resync. > Reviewing the raid1 code, I suspect that commit 79ef3a8aa1cb1523cc231c9a90a278333c21f761 introduced a bug. > Prior to this commit raise_barrier() used to wait for conf->nr_pending to become zero. It no longer does this. > It is not easy to reproduce the corruption, so I wanted to ask about the following potential fix while I am still testing it. > Once I validate that the fix indeed works, I will post a proper patch. > Do you have any feedback? > > — drivers/md/raid1.c 2014-08-22 15:19:15.000000000 -0700 > +++ /tmp/raid1.c 2014-08-29 12:07:51.000000000 -0700 > @@ -851,7 +851,7 @@ static void raise_barrier(struct r1conf > * handling. > */ > wait_event_lock_irq(conf->wait_barrier, > - !conf->array_frozen && > + !conf->array_frozen && !conf->nr_pending && > conf->barrier < RESYNC_DEPTH && > (conf->start_next_window >= > conf->next_resync + RESYNC_SECTORS), I've been through the description of the mentioned commit and briefly looked at the code. I don't understand the changes or the algorithm quite yet, but wouldn't the change you are proposing here eliminate the benefit that the commit was trying to provide? Wouldn't it simply return the behavior to the old way? I think the point of the commit is to allow nominal I/O to run concurrently to resync I/O as long as it is not done to 'window' in which the resync I/O is running. By adding the check for nr_pending, wouldn't you be blocking recovery from taking place if there are any nominal I/O's outstanding - even if they are not to the same area of the device? CC'ing patch author to get his thoughts as well. brassow-- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 17+ messages in thread
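To make the "window" idea described above concrete, here is a toy user-space model of the decision a write would face (names and the window size are invented for illustration; this is not the kernel code, and it glosses over start_next_window and the per-window request counting):

    #include <stdbool.h>
    #include <stdio.h>

    #define RESYNC_WINDOW_SECTORS (32 * 1024)   /* illustrative size only */

    /*
     * Toy model of the idea behind commit 79ef3a8: resync owns a sliding
     * window of sectors starting at the current resync position.  A normal
     * write only has to wait on the barrier if it lands inside that window;
     * writes elsewhere may proceed concurrently with resync I/O.
     */
    static bool write_must_wait(long long resync_pos, long long write_sector)
    {
            return write_sector >= resync_pos &&
                   write_sector < resync_pos + RESYNC_WINDOW_SECTORS;
    }

    int main(void)
    {
            long long resync_pos = 1000000;

            printf("write at %lld waits: %d\n", resync_pos + 100,
                   write_must_wait(resync_pos, resync_pos + 100));
            printf("write at %lld waits: %d\n",
                   resync_pos + 10 * RESYNC_WINDOW_SECTORS,
                   write_must_wait(resync_pos,
                                   resync_pos + 10 * RESYNC_WINDOW_SECTORS));
            return 0;
    }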
* Re: raid1 data corruption during resync 2014-08-29 19:29 Eivind Sarto 2014-09-02 14:10 ` Brassow Jonathan 2014-09-02 16:59 ` Brassow Jonathan @ 2014-09-02 19:24 ` Brassow Jonathan 2014-09-02 22:07 ` Eivind Sarto 2 siblings, 1 reply; 17+ messages in thread From: Brassow Jonathan @ 2014-09-02 19:24 UTC (permalink / raw) To: Eivind Sarto; +Cc: NeilBrown, linux-raid On Aug 29, 2014, at 2:29 PM, Eivind Sarto wrote: > I am seeing occasional data corruption during raid1 resync. > Reviewing the raid1 code, I suspect that commit 79ef3a8aa1cb1523cc231c9a90a278333c21f761 introduced a bug. > Prior to this commit raise_barrier() used to wait for conf->nr_pending to become zero. It no longer does this. > It is not easy to reproduce the corruption, so I wanted to ask about the following potential fix while I am still testing it. > Once I validate that the fix indeed works, I will post a proper patch. > Do you have any feedback? > > — drivers/md/raid1.c 2014-08-22 15:19:15.000000000 -0700 > +++ /tmp/raid1.c 2014-08-29 12:07:51.000000000 -0700 > @@ -851,7 +851,7 @@ static void raise_barrier(struct r1conf > * handling. > */ > wait_event_lock_irq(conf->wait_barrier, > - !conf->array_frozen && > + !conf->array_frozen && !conf->nr_pending && > conf->barrier < RESYNC_DEPTH && > (conf->start_next_window >= > conf->next_resync + RESYNC_SECTORS), This patch does not work - at least, it doesn't fix the issues I'm seeing. My system hangs (in various places, like the resync thread) after commit 79ef3a8. When testing this patch, I also added some code to dm-raid.c to allow me to print-out some of the variables when I encounter a problem. After applying this patch and printing the variables, I see: Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: start_next_window = 12288 Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: current_window_requests = -46 5257 Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: next_window_requests = -11562 Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: nr_pending = 0 Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: nr_waiting = 0 Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: nr_queued = 0 Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: barrier = 1 Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: array_frozen = 0 Some of those values look pretty bizarre to me and suggest the accounting is pretty messed up. brassow -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 17+ messages in thread
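The negative current_window_requests/next_window_requests values above are the kind of thing you would expect if a request is counted against one window when it is submitted but decremented against the other when it completes, for example because the window boundary moved in between. A toy illustration of that failure mode (not the raid1 code; the names and boundary are invented):

    #include <stdio.h>

    /*
     * Toy accounting: every submit picks a window and increments its
     * counter; every completion must decrement the *same* window.  If the
     * boundary moves while the request is in flight and the completion
     * re-derives the window from the new boundary, the decrement hits the
     * wrong counter and the counts drift negative.
     */
    static long long window_start = 1000;   /* invented boundary */
    static int current_window_requests;
    static int next_window_requests;

    static void submit_write(long long sector)
    {
            if (sector >= window_start)
                    next_window_requests++;
            else
                    current_window_requests++;
    }

    static void complete_write(long long sector)
    {
            if (sector >= window_start)
                    next_window_requests--;
            else
                    current_window_requests--;
    }

    int main(void)
    {
            submit_write(1500);     /* counted in the "next" window */
            window_start = 2000;    /* boundary advances while in flight */
            complete_write(1500);   /* ...so the decrement lands on "current" */
            printf("current_window_requests = %d, next_window_requests = %d\n",
                   current_window_requests, next_window_requests);
            return 0;
    }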
* Re: raid1 data corruption during resync 2014-09-02 19:24 ` Brassow Jonathan @ 2014-09-02 22:07 ` Eivind Sarto 2014-09-02 22:14 ` Brassow Jonathan 2014-09-02 23:55 ` NeilBrown 0 siblings, 2 replies; 17+ messages in thread From: Eivind Sarto @ 2014-09-02 22:07 UTC (permalink / raw) To: Brassow Jonathan; +Cc: Eivind Sarto, NeilBrown, linux-raid On Sep 2, 2014, at 12:24 PM, Brassow Jonathan <jbrassow@redhat.com> wrote: > > On Aug 29, 2014, at 2:29 PM, Eivind Sarto wrote: > >> I am seeing occasional data corruption during raid1 resync. >> Reviewing the raid1 code, I suspect that commit 79ef3a8aa1cb1523cc231c9a90a278333c21f761 introduced a bug. >> Prior to this commit raise_barrier() used to wait for conf->nr_pending to become zero. It no longer does this. >> It is not easy to reproduce the corruption, so I wanted to ask about the following potential fix while I am still testing it. >> Once I validate that the fix indeed works, I will post a proper patch. >> Do you have any feedback? >> >> — drivers/md/raid1.c 2014-08-22 15:19:15.000000000 -0700 >> +++ /tmp/raid1.c 2014-08-29 12:07:51.000000000 -0700 >> @@ -851,7 +851,7 @@ static void raise_barrier(struct r1conf >> * handling. >> */ >> wait_event_lock_irq(conf->wait_barrier, >> - !conf->array_frozen && >> + !conf->array_frozen && !conf->nr_pending && >> conf->barrier < RESYNC_DEPTH && >> (conf->start_next_window >= >> conf->next_resync + RESYNC_SECTORS), > > This patch does not work - at least, it doesn't fix the issues I'm seeing. My system hangs (in various places, like the resync thread) after commit 79ef3a8. When testing this patch, I also added some code to dm-raid.c to allow me to print-out some of the variables when I encounter a problem. After applying this patch and printing the variables, I see: > Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: start_next_window = 12288 > Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: current_window_requests = -46 > 5257 > Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: next_window_requests = -11562 > Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: nr_pending = 0 > Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: nr_waiting = 0 > Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: nr_queued = 0 > Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: barrier = 1 > Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: array_frozen = 0 > > Some of those values look pretty bizarre to me and suggest the accounting is pretty messed up. > > brassow > After reviewing commit 79ef3a8aa1cb1523cc231c9a90a278333c21f761 I notice that wait_barrier() will now only exclude writes. User reads are not excluded even if the fall within the resync window. The old implementation used to exclude both reads and writes while resync-IO is active. Could this be a cause of data corruption? -eivind-- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: raid1 data corruption during resync 2014-09-02 22:07 ` Eivind Sarto @ 2014-09-02 22:14 ` Brassow Jonathan 2014-09-02 23:55 ` NeilBrown 1 sibling, 0 replies; 17+ messages in thread From: Brassow Jonathan @ 2014-09-02 22:14 UTC (permalink / raw) To: Eivind Sarto; +Cc: NeilBrown, linux-raid@vger.kernel.org Raid On Sep 2, 2014, at 5:07 PM, Eivind Sarto wrote: > > On Sep 2, 2014, at 12:24 PM, Brassow Jonathan <jbrassow@redhat.com> wrote: > >> >> On Aug 29, 2014, at 2:29 PM, Eivind Sarto wrote: >> >>> I am seeing occasional data corruption during raid1 resync. >>> Reviewing the raid1 code, I suspect that commit 79ef3a8aa1cb1523cc231c9a90a278333c21f761 introduced a bug. >>> Prior to this commit raise_barrier() used to wait for conf->nr_pending to become zero. It no longer does this. >>> It is not easy to reproduce the corruption, so I wanted to ask about the following potential fix while I am still testing it. >>> Once I validate that the fix indeed works, I will post a proper patch. >>> Do you have any feedback? >>> >>> — drivers/md/raid1.c 2014-08-22 15:19:15.000000000 -0700 >>> +++ /tmp/raid1.c 2014-08-29 12:07:51.000000000 -0700 >>> @@ -851,7 +851,7 @@ static void raise_barrier(struct r1conf >>> * handling. >>> */ >>> wait_event_lock_irq(conf->wait_barrier, >>> - !conf->array_frozen && >>> + !conf->array_frozen && !conf->nr_pending && >>> conf->barrier < RESYNC_DEPTH && >>> (conf->start_next_window >= >>> conf->next_resync + RESYNC_SECTORS), >> >> This patch does not work - at least, it doesn't fix the issues I'm seeing. My system hangs (in various places, like the resync thread) after commit 79ef3a8. When testing this patch, I also added some code to dm-raid.c to allow me to print-out some of the variables when I encounter a problem. After applying this patch and printing the variables, I see: >> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: start_next_window = 12288 >> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: current_window_requests = -46 >> 5257 >> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: next_window_requests = -11562 >> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: nr_pending = 0 >> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: nr_waiting = 0 >> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: nr_queued = 0 >> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: barrier = 1 >> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: array_frozen = 0 >> >> Some of those values look pretty bizarre to me and suggest the accounting is pretty messed up. >> >> brassow >> > > After reviewing commit 79ef3a8aa1cb1523cc231c9a90a278333c21f761 I notice that wait_barrier() will now only exclude writes. User reads are not excluded even if the fall within the resync window. > The old implementation used to exclude both reads and writes while resync-IO is active. > Could this be a cause of data corruption? Not sure yet. It is also worth mentioning that a follow-on commit (8b592b6) was meant to fix problems with request counting. I did not have that patch applied when I gathered the above info. brassow -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: raid1 data corruption during resync 2014-09-02 22:07 ` Eivind Sarto 2014-09-02 22:14 ` Brassow Jonathan @ 2014-09-02 23:55 ` NeilBrown 2014-09-03 0:48 ` Eivind Sarto 2014-09-03 1:18 ` Brassow Jonathan 1 sibling, 2 replies; 17+ messages in thread From: NeilBrown @ 2014-09-02 23:55 UTC (permalink / raw) To: Eivind Sarto; +Cc: Brassow Jonathan, linux-raid [-- Attachment #1: Type: text/plain, Size: 3245 bytes --] On Tue, 2 Sep 2014 15:07:26 -0700 Eivind Sarto <eivindsarto@gmail.com> wrote: > > On Sep 2, 2014, at 12:24 PM, Brassow Jonathan <jbrassow@redhat.com> wrote: > > > > > On Aug 29, 2014, at 2:29 PM, Eivind Sarto wrote: > > > >> I am seeing occasional data corruption during raid1 resync. > >> Reviewing the raid1 code, I suspect that commit 79ef3a8aa1cb1523cc231c9a90a278333c21f761 introduced a bug. > >> Prior to this commit raise_barrier() used to wait for conf->nr_pending to become zero. It no longer does this. > >> It is not easy to reproduce the corruption, so I wanted to ask about the following potential fix while I am still testing it. > >> Once I validate that the fix indeed works, I will post a proper patch. > >> Do you have any feedback? > >> > >> — drivers/md/raid1.c 2014-08-22 15:19:15.000000000 -0700 > >> +++ /tmp/raid1.c 2014-08-29 12:07:51.000000000 -0700 > >> @@ -851,7 +851,7 @@ static void raise_barrier(struct r1conf > >> * handling. > >> */ > >> wait_event_lock_irq(conf->wait_barrier, > >> - !conf->array_frozen && > >> + !conf->array_frozen && !conf->nr_pending && > >> conf->barrier < RESYNC_DEPTH && > >> (conf->start_next_window >= > >> conf->next_resync + RESYNC_SECTORS), > > > > This patch does not work - at least, it doesn't fix the issues I'm seeing. My system hangs (in various places, like the resync thread) after commit 79ef3a8. When testing this patch, I also added some code to dm-raid.c to allow me to print-out some of the variables when I encounter a problem. After applying this patch and printing the variables, I see: > > Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: start_next_window = 12288 > > Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: current_window_requests = -46 > > 5257 > > Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: next_window_requests = -11562 > > Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: nr_pending = 0 > > Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: nr_waiting = 0 > > Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: nr_queued = 0 > > Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: barrier = 1 > > Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: array_frozen = 0 > > > > Some of those values look pretty bizarre to me and suggest the accounting is pretty messed up. > > > > brassow > > > > After reviewing commit 79ef3a8aa1cb1523cc231c9a90a278333c21f761 I notice that wait_barrier() will now only exclude writes. User reads are not excluded even if the fall within the resync window. > The old implementation used to exclude both reads and writes while resync-IO is active. > Could this be a cause of data corruption? > Could be. From read_balance: if (conf->mddev->recovery_cp < MaxSector && (this_sector + sectors >= conf->next_resync)) choose_first = 1; else choose_first = 0; This used to be safe because a read immediately before next_resync would wait until all resync requests completed. But now that read requests don't block it isn't safe. Probably best to make this: choose_first = (conf->mddev->recovery_cp < this_sector + sectors); Can you test that? 
Thanks, NeilBrown [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 828 bytes --] ^ permalink raw reply [flat|nested] 17+ messages in thread
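Spelling out the suggestion above as a diff against read_balance() (a sketch of the proposed change, not a tested patch):

    --- a/drivers/md/raid1.c
    +++ b/drivers/md/raid1.c
    @@ read_balance() @@
    -	if (conf->mddev->recovery_cp < MaxSector &&
    -	    (this_sector + sectors >= conf->next_resync))
    -		choose_first = 1;
    -	else
    -		choose_first = 0;
    +	choose_first = (conf->mddev->recovery_cp < this_sector + sectors);

Since recovery_cp records how far the resync has progressed, this makes any read that extends past the already-resynced region go to the first (in-sync) device, rather than only reads that reach next_resync.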
* Re: raid1 data corruption during resync 2014-09-02 23:55 ` NeilBrown @ 2014-09-03 0:48 ` Eivind Sarto 2014-09-03 1:18 ` Brassow Jonathan 1 sibling, 0 replies; 17+ messages in thread From: Eivind Sarto @ 2014-09-03 0:48 UTC (permalink / raw) To: NeilBrown; +Cc: Eivind Sarto, Brassow Jonathan, linux-raid On Sep 2, 2014, at 4:55 PM, NeilBrown <neilb@suse.de> wrote: > On Tue, 2 Sep 2014 15:07:26 -0700 Eivind Sarto <eivindsarto@gmail.com> wrote: > >> >> On Sep 2, 2014, at 12:24 PM, Brassow Jonathan <jbrassow@redhat.com> wrote: >> >>> >>> On Aug 29, 2014, at 2:29 PM, Eivind Sarto wrote: >>> >>>> I am seeing occasional data corruption during raid1 resync. >>>> Reviewing the raid1 code, I suspect that commit 79ef3a8aa1cb1523cc231c9a90a278333c21f761 introduced a bug. >>>> Prior to this commit raise_barrier() used to wait for conf->nr_pending to become zero. It no longer does this. >>>> It is not easy to reproduce the corruption, so I wanted to ask about the following potential fix while I am still testing it. >>>> Once I validate that the fix indeed works, I will post a proper patch. >>>> Do you have any feedback? >>>> >>>> — drivers/md/raid1.c 2014-08-22 15:19:15.000000000 -0700 >>>> +++ /tmp/raid1.c 2014-08-29 12:07:51.000000000 -0700 >>>> @@ -851,7 +851,7 @@ static void raise_barrier(struct r1conf >>>> * handling. >>>> */ >>>> wait_event_lock_irq(conf->wait_barrier, >>>> - !conf->array_frozen && >>>> + !conf->array_frozen && !conf->nr_pending && >>>> conf->barrier < RESYNC_DEPTH && >>>> (conf->start_next_window >= >>>> conf->next_resync + RESYNC_SECTORS), >>> >>> This patch does not work - at least, it doesn't fix the issues I'm seeing. My system hangs (in various places, like the resync thread) after commit 79ef3a8. When testing this patch, I also added some code to dm-raid.c to allow me to print-out some of the variables when I encounter a problem. After applying this patch and printing the variables, I see: >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: start_next_window = 12288 >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: current_window_requests = -46 >>> 5257 >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: next_window_requests = -11562 >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: nr_pending = 0 >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: nr_waiting = 0 >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: nr_queued = 0 >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: barrier = 1 >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: array_frozen = 0 >>> >>> Some of those values look pretty bizarre to me and suggest the accounting is pretty messed up. >>> >>> brassow >>> >> >> After reviewing commit 79ef3a8aa1cb1523cc231c9a90a278333c21f761 I notice that wait_barrier() will now only exclude writes. User reads are not excluded even if the fall within the resync window. >> The old implementation used to exclude both reads and writes while resync-IO is active. >> Could this be a cause of data corruption? >> > > Could be. > From read_balance: > > if (conf->mddev->recovery_cp < MaxSector && > (this_sector + sectors >= conf->next_resync)) > choose_first = 1; > else > choose_first = 0; > > This used to be safe because a read immediately before next_resync would wait > until all resync requests completed. But now that read requests don't block > it isn't safe. > Probably best to make this: > > choose_first = (conf->mddev->recovery_cp < this_sector + sectors); > > Can you test that? > > Thanks, > NeilBrown I’ll give it a try. 
-eivind -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: raid1 data corruption during resync 2014-09-02 23:55 ` NeilBrown 2014-09-03 0:48 ` Eivind Sarto @ 2014-09-03 1:18 ` Brassow Jonathan 2014-09-03 1:31 ` NeilBrown 1 sibling, 1 reply; 17+ messages in thread From: Brassow Jonathan @ 2014-09-03 1:18 UTC (permalink / raw) To: NeilBrown; +Cc: Eivind Sarto, linux-raid On Sep 2, 2014, at 6:55 PM, NeilBrown wrote: > On Tue, 2 Sep 2014 15:07:26 -0700 Eivind Sarto <eivindsarto@gmail.com> wrote: > >> >> On Sep 2, 2014, at 12:24 PM, Brassow Jonathan <jbrassow@redhat.com> wrote: >> >>> >>> On Aug 29, 2014, at 2:29 PM, Eivind Sarto wrote: >>> >>>> I am seeing occasional data corruption during raid1 resync. >>>> Reviewing the raid1 code, I suspect that commit 79ef3a8aa1cb1523cc231c9a90a278333c21f761 introduced a bug. >>>> Prior to this commit raise_barrier() used to wait for conf->nr_pending to become zero. It no longer does this. >>>> It is not easy to reproduce the corruption, so I wanted to ask about the following potential fix while I am still testing it. >>>> Once I validate that the fix indeed works, I will post a proper patch. >>>> Do you have any feedback? >>>> >>>> — drivers/md/raid1.c 2014-08-22 15:19:15.000000000 -0700 >>>> +++ /tmp/raid1.c 2014-08-29 12:07:51.000000000 -0700 >>>> @@ -851,7 +851,7 @@ static void raise_barrier(struct r1conf >>>> * handling. >>>> */ >>>> wait_event_lock_irq(conf->wait_barrier, >>>> - !conf->array_frozen && >>>> + !conf->array_frozen && !conf->nr_pending && >>>> conf->barrier < RESYNC_DEPTH && >>>> (conf->start_next_window >= >>>> conf->next_resync + RESYNC_SECTORS), >>> >>> This patch does not work - at least, it doesn't fix the issues I'm seeing. My system hangs (in various places, like the resync thread) after commit 79ef3a8. When testing this patch, I also added some code to dm-raid.c to allow me to print-out some of the variables when I encounter a problem. After applying this patch and printing the variables, I see: >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: start_next_window = 12288 >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: current_window_requests = -46 >>> 5257 >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: next_window_requests = -11562 >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: nr_pending = 0 >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: nr_waiting = 0 >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: nr_queued = 0 >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: barrier = 1 >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: array_frozen = 0 >>> >>> Some of those values look pretty bizarre to me and suggest the accounting is pretty messed up. >>> >>> brassow >>> >> >> After reviewing commit 79ef3a8aa1cb1523cc231c9a90a278333c21f761 I notice that wait_barrier() will now only exclude writes. User reads are not excluded even if the fall within the resync window. >> The old implementation used to exclude both reads and writes while resync-IO is active. >> Could this be a cause of data corruption? >> > > Could be. > From read_balance: > > if (conf->mddev->recovery_cp < MaxSector && > (this_sector + sectors >= conf->next_resync)) > choose_first = 1; > else > choose_first = 0; > > This used to be safe because a read immediately before next_resync would wait > until all resync requests completed. But now that read requests don't block > it isn't safe. > Probably best to make this: > > choose_first = (conf->mddev->recovery_cp < this_sector + sectors); > > Can you test that? 
Unfortunately, I doubt very much that that will solve the issue that I am seeing. (Although, I'd welcome this fix also.) I have predominantly been testing RHEL kernels, but I've seen this with the latest upstream kernel as well. When doing fault testing, like failing a drive while the resync is happening, the processes can become hung. Here's one example, but they vary: INFO: task mdX_resync:11334 blocked for more than 120 seconds. Not tainted 2.6.32-497.el6.x86_64 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. mdX_resync D 0000000000000000 0 11334 2 0x00000080 ffff8801fada5be0 0000000000000046 ffff88021a3bcc00 0000000300000001 ffff880218b92200 ffff880218668f50 0000000000000282 0000000000000003 0000000000000001 0000000000000282 ffff88021bfb45f8 ffff8801fada5fd8 Call Trace: [<ffffffff8140e439>] ? md_wakeup_thread+0x39/0x70 [<ffffffffa02b81dd>] sync_request+0x26d/0xb90 [raid1] [<ffffffff81064b90>] ? default_wake_function+0x0/0x20 [<ffffffff810577e9>] ? __wake_up_common+0x59/0x90 [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40 [<ffffffff8105bd23>] ? __wake_up+0x53/0x70 [<ffffffff81414a47>] md_do_sync+0x6c7/0xd20 [<ffffffff814154f5>] md_thread+0x115/0x150 [<ffffffff814153e0>] ? md_thread+0x0/0x150 [<ffffffff8109e66e>] kthread+0x9e/0xc0 [<ffffffff8100c20a>] child_rip+0xa/0x20 [<ffffffff8109e5d0>] ? kthread+0x0/0xc0 [<ffffffff8100c200>] ? child_rip+0x0/0x20 and... INFO: task kjournald:10732 blocked for more than 120 seconds. Not tainted 2.6.32-497.el6.x86_64 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kjournald D 0000000000000008 0 10732 2 0x00000080 ffff8801d71238b0 0000000000000046 0000000000000000 0000000300000001 ffff880218b92200 ffff880218668f50 000003783fd7424b 0000000000000003 0000000000000001 000000010035a163 ffff880219135af8 ffff8801d7123fd8 Call Trace: [<ffffffffa02b463f>] wait_barrier+0xcf/0x280 [raid1] [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40 [<ffffffffa02b71bf>] make_request+0x8f/0xe40 [raid1] [<ffffffff81126af5>] ? mempool_alloc_slab+0x15/0x20 [<ffffffff81126c93>] ? mempool_alloc+0x63/0x140 [<ffffffff811c9d6b>] ? bio_alloc_bioset+0x5b/0xf0 [<ffffffffa02cf018>] raid_map+0x18/0x20 [dm_raid] Bisecting has revealed that the commit that is the subject of this thread is the root of the problem. It does take a long time to reproduce the issue with the scripts that I have, but if there is info I can get you, let me know. brassow -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: raid1 data corruption during resync 2014-09-03 1:18 ` Brassow Jonathan @ 2014-09-03 1:31 ` NeilBrown 2014-09-03 1:45 ` Brassow Jonathan 0 siblings, 1 reply; 17+ messages in thread From: NeilBrown @ 2014-09-03 1:31 UTC (permalink / raw) To: Brassow Jonathan; +Cc: Eivind Sarto, linux-raid [-- Attachment #1: Type: text/plain, Size: 7491 bytes --] On Tue, 2 Sep 2014 20:18:06 -0500 Brassow Jonathan <jbrassow@redhat.com> wrote: > > On Sep 2, 2014, at 6:55 PM, NeilBrown wrote: > > > On Tue, 2 Sep 2014 15:07:26 -0700 Eivind Sarto <eivindsarto@gmail.com> wrote: > > > >> > >> On Sep 2, 2014, at 12:24 PM, Brassow Jonathan <jbrassow@redhat.com> wrote: > >> > >>> > >>> On Aug 29, 2014, at 2:29 PM, Eivind Sarto wrote: > >>> > >>>> I am seeing occasional data corruption during raid1 resync. > >>>> Reviewing the raid1 code, I suspect that commit 79ef3a8aa1cb1523cc231c9a90a278333c21f761 introduced a bug. > >>>> Prior to this commit raise_barrier() used to wait for conf->nr_pending to become zero. It no longer does this. > >>>> It is not easy to reproduce the corruption, so I wanted to ask about the following potential fix while I am still testing it. > >>>> Once I validate that the fix indeed works, I will post a proper patch. > >>>> Do you have any feedback? > >>>> > >>>> — drivers/md/raid1.c 2014-08-22 15:19:15.000000000 -0700 > >>>> +++ /tmp/raid1.c 2014-08-29 12:07:51.000000000 -0700 > >>>> @@ -851,7 +851,7 @@ static void raise_barrier(struct r1conf > >>>> * handling. > >>>> */ > >>>> wait_event_lock_irq(conf->wait_barrier, > >>>> - !conf->array_frozen && > >>>> + !conf->array_frozen && !conf->nr_pending && > >>>> conf->barrier < RESYNC_DEPTH && > >>>> (conf->start_next_window >= > >>>> conf->next_resync + RESYNC_SECTORS), > >>> > >>> This patch does not work - at least, it doesn't fix the issues I'm seeing. My system hangs (in various places, like the resync thread) after commit 79ef3a8. When testing this patch, I also added some code to dm-raid.c to allow me to print-out some of the variables when I encounter a problem. After applying this patch and printing the variables, I see: > >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: start_next_window = 12288 > >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: current_window_requests = -46 > >>> 5257 > >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: next_window_requests = -11562 > >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: nr_pending = 0 > >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: nr_waiting = 0 > >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: nr_queued = 0 > >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: barrier = 1 > >>> Sep 2 14:04:15 bp-01 kernel: device-mapper: raid: array_frozen = 0 > >>> > >>> Some of those values look pretty bizarre to me and suggest the accounting is pretty messed up. > >>> > >>> brassow > >>> > >> > >> After reviewing commit 79ef3a8aa1cb1523cc231c9a90a278333c21f761 I notice that wait_barrier() will now only exclude writes. User reads are not excluded even if the fall within the resync window. > >> The old implementation used to exclude both reads and writes while resync-IO is active. > >> Could this be a cause of data corruption? > >> > > > > Could be. > > From read_balance: > > > > if (conf->mddev->recovery_cp < MaxSector && > > (this_sector + sectors >= conf->next_resync)) > > choose_first = 1; > > else > > choose_first = 0; > > > > This used to be safe because a read immediately before next_resync would wait > > until all resync requests completed. 
But now that read requests don't block > > it isn't safe. > > Probably best to make this: > > > > choose_first = (conf->mddev->recovery_cp < this_sector + sectors); > > > > Can you test that? > > Unfortunately, I doubt very much that that will solve the issue that I am seeing. (Although, I'd welcome this fix also.) > > I have predominantly been testing RHEL kernels, but I've seen this with the latest upstream kernel as well. When doing fault testing, like failing a drive while the resync is happening, the processes can become hung. Here's one example, but they vary: > > INFO: task mdX_resync:11334 blocked for more than 120 seconds. > Not tainted 2.6.32-497.el6.x86_64 #1 > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > mdX_resync D 0000000000000000 0 11334 2 0x00000080 > ffff8801fada5be0 0000000000000046 ffff88021a3bcc00 0000000300000001 > ffff880218b92200 ffff880218668f50 0000000000000282 0000000000000003 > 0000000000000001 0000000000000282 ffff88021bfb45f8 ffff8801fada5fd8 > Call Trace: > [<ffffffff8140e439>] ? md_wakeup_thread+0x39/0x70 > [<ffffffffa02b81dd>] sync_request+0x26d/0xb90 [raid1] > [<ffffffff81064b90>] ? default_wake_function+0x0/0x20 > [<ffffffff810577e9>] ? __wake_up_common+0x59/0x90 > [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40 > [<ffffffff8105bd23>] ? __wake_up+0x53/0x70 > [<ffffffff81414a47>] md_do_sync+0x6c7/0xd20 > [<ffffffff814154f5>] md_thread+0x115/0x150 > [<ffffffff814153e0>] ? md_thread+0x0/0x150 > [<ffffffff8109e66e>] kthread+0x9e/0xc0 > [<ffffffff8100c20a>] child_rip+0xa/0x20 > [<ffffffff8109e5d0>] ? kthread+0x0/0xc0 > [<ffffffff8100c200>] ? child_rip+0x0/0x20 > > > and... > > > INFO: task kjournald:10732 blocked for more than 120 seconds. > Not tainted 2.6.32-497.el6.x86_64 #1 > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > kjournald D 0000000000000008 0 10732 2 0x00000080 > ffff8801d71238b0 0000000000000046 0000000000000000 0000000300000001 > ffff880218b92200 ffff880218668f50 000003783fd7424b 0000000000000003 > 0000000000000001 000000010035a163 ffff880219135af8 ffff8801d7123fd8 > Call Trace: > [<ffffffffa02b463f>] wait_barrier+0xcf/0x280 [raid1] > [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40 > [<ffffffffa02b71bf>] make_request+0x8f/0xe40 [raid1] > [<ffffffff81126af5>] ? mempool_alloc_slab+0x15/0x20 > [<ffffffff81126c93>] ? mempool_alloc+0x63/0x140 > [<ffffffff811c9d6b>] ? bio_alloc_bioset+0x5b/0xf0 > [<ffffffffa02cf018>] raid_map+0x18/0x20 [dm_raid] > > Bisecting has revealed that the commit that is the subject of this thread is the root of the problem. It does take a long time to reproduce the issue with the scripts that I have, but if there is info I can get you, let me know. > > brassow Thanks for the details Jon. I don't suppose you have a stack trace from an upstream kernel? Not that I doubt you have one, and I suspect it is very similar. But I've learnt not to trust "suspect" and "presume" and "no doubt". I'll have a look anyway. Thanks, NeilBrown [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 828 bytes --] ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: raid1 data corruption during resync 2014-09-03 1:31 ` NeilBrown @ 2014-09-03 1:45 ` Brassow Jonathan 2014-09-03 21:39 ` Brassow Jonathan 0 siblings, 1 reply; 17+ messages in thread From: Brassow Jonathan @ 2014-09-03 1:45 UTC (permalink / raw) To: NeilBrown; +Cc: Eivind Sarto, linux-raid On Sep 2, 2014, at 8:31 PM, NeilBrown wrote: > Thanks for the details Jon. I don't suppose you have a stack trace from an > upstream kernel? Not that I doubt you have one, and I suspect it is very > similar. But I've learnt not to trust "suspect" and "presume" and "no doubt". > > I'll have a look anyway. Indeed - me too. I'll get started on recreating it on the latest upstream kernel now. It does take several hours to reproduce on average, so I likely won't send the update until tomorrow. brassow ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: raid1 data corruption during resync 2014-09-03 1:45 ` Brassow Jonathan @ 2014-09-03 21:39 ` Brassow Jonathan 2014-09-04 5:28 ` NeilBrown 0 siblings, 1 reply; 17+ messages in thread From: Brassow Jonathan @ 2014-09-03 21:39 UTC (permalink / raw) To: NeilBrown; +Cc: Eivind Sarto, linux-raid@vger.kernel.org Raid On Sep 2, 2014, at 8:45 PM, Brassow Jonathan wrote: > > On Sep 2, 2014, at 8:31 PM, NeilBrown wrote: > >> Thanks for the details Jon. I don't suppose you have a stack trace from an >> upstream kernel? Not that I doubt you have one, and I suspect it is very >> similar. But I've learnt not to trust "suspect" and "presume" and "no doubt". >> >> I'll have a look anyway. > > Indeed - me too. I'll get started on recreating it on the latest upstream kernel now. It does take several hours to reproduce on average, so I likely won't send the update until tomorrow. > > brassow The test being run does the following: 1) Writes I/O to an FS which is sitting on a RAID dev (created via LVM, but using MD personalities) 2) The RAID dev is undergoing its initial resync 3) The device in slot 0 suffers a failure. (I can't be sure at the moment if I've seen this trigger from a device failure from a different slot.) Sorry the test is so involved. I haven't been able to create a simple reproducer and the tests I have take quite a while (i.e. a number of iterations) to hit the problem. These backtraces came after 45min, which is actually really quick compared to what I've seen. The kernel I used was 3.17.0-rc3. It is unmodified - I have not applied the patch for reads to potentially resync'ing areas. brassow Sep 3 09:18:36 bp-01 kernel: INFO: task kworker/u130:1:6600 blocked for more than 120 seconds. Sep 3 09:18:36 bp-01 kernel: Tainted: G E 3.17.0-rc3 #1 Sep 3 09:18:36 bp-01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Sep 3 09:18:36 bp-01 kernel: kworker/u130:1 D 0000000000000008 0 6600 2 0x00000080 Sep 3 09:18:36 bp-01 kernel: Workqueue: writeback bdi_writeback_workfn (flush-253:11) Sep 3 09:18:36 bp-01 kernel: ffff880415b3b868 0000000000000046 ffff880415b3b858 ffff88021727efc0 Sep 3 09:18:36 bp-01 kernel: ffff880415b38010 0000000000012bc0 0000000000012bc0 ffff8804024e4150 Sep 3 09:18:36 bp-01 kernel: ffff880415b3b838 ffff88041ea52bc0 ffff8804024e4150 ffff880415b3b920 Sep 3 09:18:36 bp-01 kernel: Call Trace: Sep 3 09:18:36 bp-01 kernel: [<ffffffff81580830>] ? yield_to+0x180/0x180 Sep 3 09:18:36 bp-01 kernel: [<ffffffff81580549>] schedule+0x29/0x70 Sep 3 09:18:36 bp-01 kernel: [<ffffffff8158061c>] io_schedule+0x8c/0xd0 Sep 3 09:18:36 bp-01 kernel: [<ffffffff8158085c>] bit_wait_io+0x2c/0x50 Sep 3 09:18:36 bp-01 kernel: [<ffffffff81580946>] __wait_on_bit_lock+0x76/0xb0 Sep 3 09:18:36 bp-01 kernel: [<ffffffff81135057>] __lock_page+0x67/0x70 Sep 3 09:18:36 bp-01 kernel: [<ffffffff8108eb90>] ? wake_atomic_t_function+0x40/0x40 Sep 3 09:18:36 bp-01 kernel: [<ffffffff81141468>] write_cache_pages+0x318/0x510 Sep 3 09:18:36 bp-01 kernel: [<ffffffffa01d8b5a>] ? ext4_writepages+0x5a/0x6a0 [ext4] Sep 3 09:18:36 bp-01 kernel: [<ffffffff8113ffc0>] ? set_page_dirty+0x60/0x60 Sep 3 09:18:36 bp-01 kernel: [<ffffffff811416b1>] generic_writepages+0x51/0x80 Sep 3 09:18:36 bp-01 kernel: [<ffffffff81141715>] do_writepages+0x35/0x40 Sep 3 09:18:36 bp-01 kernel: [<ffffffff811bfcf9>] __writeback_single_inode+0x49/0x230 Sep 3 09:18:36 bp-01 kernel: [<ffffffff8108eeef>] ? 
wake_up_bit+0x2f/0x40
Sep 3 09:18:36 bp-01 kernel: [<ffffffff811c3139>] writeback_sb_inodes+0x249/0x360
Sep 3 09:18:36 bp-01 kernel: [<ffffffff8119a415>] ? put_super+0x25/0x40
Sep 3 09:18:36 bp-01 kernel: [<ffffffff811c32ee>] __writeback_inodes_wb+0x9e/0xd0
Sep 3 09:18:36 bp-01 kernel: [<ffffffff811c351b>] wb_writeback+0x1fb/0x2c0
Sep 3 09:18:36 bp-01 kernel: [<ffffffff811c3662>] wb_do_writeback+0x82/0x1f0
Sep 3 09:18:36 bp-01 kernel: [<ffffffff81078fda>] ? ttwu_do_activate.clone.0+0x4a/0x60
Sep 3 09:18:36 bp-01 kernel: [<ffffffff811c3840>] bdi_writeback_workfn+0x70/0x210
Sep 3 09:18:36 bp-01 kernel: [<ffffffff8106b762>] process_one_work+0x182/0x450
Sep 3 09:18:36 bp-01 kernel: [<ffffffff8106bb4f>] worker_thread+0x11f/0x3c0
Sep 3 09:18:36 bp-01 kernel: [<ffffffff8106ba30>] ? process_one_work+0x450/0x450
Sep 3 09:18:36 bp-01 kernel: [<ffffffff8107083e>] kthread+0xce/0xf0
Sep 3 09:18:36 bp-01 kernel: [<ffffffff81070770>] ? kthread_freezable_should_stop+0x70/0x70
Sep 3 09:18:36 bp-01 kernel: [<ffffffff8158412c>] ret_from_fork+0x7c/0xb0
Sep 3 09:18:36 bp-01 kernel: [<ffffffff81070770>] ? kthread_freezable_should_stop+0x70/0x70
Sep 3 09:18:36 bp-01 kernel: INFO: task kworker/0:4:8765 blocked for more than 120 seconds.
Sep 3 09:18:36 bp-01 kernel: Tainted: G E 3.17.0-rc3 #1
Sep 3 09:18:36 bp-01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 3 09:18:36 bp-01 kernel: kworker/0:4 D 0000000000000000 0 8765 2 0x00000080
Sep 3 09:18:36 bp-01 kernel: Workqueue: kcopyd do_work [dm_mod]
Sep 3 09:18:36 bp-01 kernel: ffff8802102ab898 0000000000000046 0000000000000000 ffffffff81a19480
Sep 3 09:18:36 bp-01 kernel: ffff8802102a8010 0000000000012bc0 0000000000012bc0 ffff880215950010
Sep 3 09:18:36 bp-01 kernel: ffff8802102ab898 ffff88040171e100 ffff88040171e178 ffff8802102ab8c8
Sep 3 09:18:36 bp-01 kernel: Call Trace:
Sep 3 09:18:36 bp-01 kernel: [<ffffffff81580549>] schedule+0x29/0x70
Sep 3 09:18:36 bp-01 kernel: [<ffffffffa0389e99>] wait_barrier+0xb9/0x260 [raid1]
Sep 3 09:18:36 bp-01 kernel: [<ffffffff8108eb00>] ? bit_waitqueue+0xe0/0xe0
Sep 3 09:18:36 bp-01 kernel: [<ffffffff8108ee13>] ? __wake_up+0x53/0x70
Sep 3 09:18:36 bp-01 kernel: [<ffffffffa038caaa>] make_request+0x9a/0xc00 [raid1]
Sep 3 09:18:37 bp-01 lvm[2109]: device-mapper: waitevent ioctl on (LVM-PSbaO1Zwzw2wYurZrkrTkI5grmcS3IQ3ot9VJPwplnllFf45AU5vBkbtjfLU0adG) failed: Interrupted system call
Sep 3 09:18:37 bp-01 kernel: [<ffffffff81137f60>] ? mempool_alloc+0x60/0x170
Sep 3 09:18:37 bp-01 kernel: [<ffffffff81186ea8>] ? cache_alloc_refill+0x98/0x2c0
Sep 3 09:18:37 bp-01 kernel: [<ffffffffa03a4018>] raid_map+0x18/0x20 [dm_raid]
Sep 3 09:18:37 bp-01 kernel: [<ffffffffa000336a>] __map_bio+0x4a/0x120 [dm_mod]
Sep 3 09:18:37 bp-01 kernel: [<ffffffffa0003723>] __clone_and_map_data_bio+0x113/0x130 [dm_mod]
Sep 3 09:18:37 bp-01 kernel: [<ffffffffa00037ac>] __split_and_process_non_flush+0x6c/0xb0 [dm_mod]
Sep 3 09:18:37 bp-01 kernel: [<ffffffffa0003991>] __split_and_process_bio+0x1a1/0x200 [dm_mod]
Sep 3 09:18:37 bp-01 kernel: [<ffffffffa0003b12>] _dm_request+0x122/0x190 [dm_mod]
Sep 3 09:18:37 bp-01 kernel: [<ffffffffa0003ba8>] dm_request+0x28/0x40 [dm_mod]
Sep 3 09:18:37 bp-01 kernel: [<ffffffff812693b0>] generic_make_request+0xc0/0x100
Sep 3 09:18:37 bp-01 kernel: [<ffffffffa03ba00c>] pending_complete+0x13c/0x2a0 [dm_snapshot]
Sep 3 09:18:37 bp-01 kernel: [<ffffffffa03ba29e>] commit_callback+0xe/0x10 [dm_snapshot]
Sep 3 09:18:37 bp-01 kernel: [<ffffffffa03bcea8>] persistent_commit_exception+0xb8/0x160 [dm_snapshot]
Sep 3 09:18:37 bp-01 kernel: [<ffffffffa03ba1a3>] complete_exception+0x33/0x50 [dm_snapshot]
Sep 3 09:18:37 bp-01 kernel: [<ffffffffa03ba26a>] copy_callback+0xaa/0xd0 [dm_snapshot]
Sep 3 09:18:37 bp-01 kernel: [<ffffffffa03ba1c0>] ? complete_exception+0x50/0x50 [dm_snapshot]
Sep 3 09:18:37 bp-01 kernel: [<ffffffffa000b788>] run_complete_job+0x98/0xe0 [dm_mod]
Sep 3 09:18:37 bp-01 kernel: [<ffffffffa000b21b>] process_jobs+0x5b/0x100 [dm_mod]
Sep 3 09:18:37 bp-01 kernel: [<ffffffffa000b6f0>] ? dispatch_job+0x80/0x80 [dm_mod]
Sep 3 09:18:37 bp-01 kernel: [<ffffffffa000b2ff>] do_work+0x3f/0x80 [dm_mod]
Sep 3 09:18:37 bp-01 kernel: [<ffffffff8106b762>] process_one_work+0x182/0x450
Sep 3 09:18:37 bp-01 kernel: [<ffffffff8106bb4f>] worker_thread+0x11f/0x3c0
Sep 3 09:18:37 bp-01 kernel: [<ffffffff8106ba30>] ? process_one_work+0x450/0x450
Sep 3 09:18:37 bp-01 kernel: [<ffffffff8107083e>] kthread+0xce/0xf0
Sep 3 09:18:37 bp-01 kernel: [<ffffffff81070770>] ? kthread_freezable_should_stop+0x70/0x70
Sep 3 09:18:37 bp-01 kernel: [<ffffffff8158412c>] ret_from_fork+0x7c/0xb0
Sep 3 09:18:37 bp-01 kernel: [<ffffffff81070770>] ? kthread_freezable_should_stop+0x70/0x70
Sep 3 09:18:37 bp-01 kernel: INFO: task kjournald:12268 blocked for more than 120 seconds.
Sep 3 09:18:37 bp-01 kernel: Tainted: G E 3.17.0-rc3 #1
Sep 3 09:18:37 bp-01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 3 09:18:37 bp-01 kernel: kjournald D 0000000000000000 0 12268 2 0x00000080
Sep 3 09:18:37 bp-01 kernel: ffff88040d373c08 0000000000000046 ffff88040d373bd8 ffffffff81a19480
Sep 3 09:18:37 bp-01 kernel: ffff88040d370010 0000000000012bc0 0000000000012bc0 ffff8804153a0e80
Sep 3 09:18:37 bp-01 kernel: ffff88040d373c08 ffff88021fa12bc0 ffff8804153a0e80 ffffffff81580830
Sep 3 09:18:37 bp-01 kernel: Call Trace:
Sep 3 09:18:37 bp-01 kernel: [<ffffffff81580830>] ? yield_to+0x180/0x180
Sep 3 09:18:37 bp-01 kernel: [<ffffffff81580549>] schedule+0x29/0x70
Sep 3 09:18:37 bp-01 kernel: [<ffffffff8158061c>] io_schedule+0x8c/0xd0
Sep 3 09:18:37 bp-01 kernel: [<ffffffff8158085c>] bit_wait_io+0x2c/0x50
Sep 3 09:18:37 bp-01 kernel: [<ffffffff81580b45>] __wait_on_bit+0x65/0x90
Sep 3 09:18:37 bp-01 kernel: [<ffffffff81580830>] ? yield_to+0x180/0x180
Sep 3 09:18:37 bp-01 kernel: [<ffffffff81580be8>] out_of_line_wait_on_bit+0x78/0x90
Sep 3 09:18:37 bp-01 kernel: [<ffffffff8108eb90>] ? wake_atomic_t_function+0x40/0x40
Sep 3 09:18:37 bp-01 kernel: [<ffffffff811ca14e>] __wait_on_buffer+0x2e/0x30
Sep 3 09:18:37 bp-01 kernel: [<ffffffffa0448372>] journal_commit_transaction+0x872/0xf80 [jbd]
Sep 3 09:18:37 bp-01 kernel: [<ffffffff8108847f>] ? put_prev_entity+0x2f/0x400
Sep 3 09:18:37 bp-01 kernel: [<ffffffff810b21bb>] ? try_to_del_timer_sync+0x5b/0x70
Sep 3 09:18:37 bp-01 kernel: [<ffffffffa044bae1>] kjournald+0xf1/0x270 [jbd]
Sep 3 09:18:37 bp-01 kernel: [<ffffffff8108eb00>] ? bit_waitqueue+0xe0/0xe0
Sep 3 09:18:37 bp-01 kernel: [<ffffffffa044b9f0>] ? commit_timeout+0x10/0x10 [jbd]
Sep 3 09:18:37 bp-01 kernel: [<ffffffff8107083e>] kthread+0xce/0xf0
Sep 3 09:18:37 bp-01 kernel: [<ffffffff81070770>] ? kthread_freezable_should_stop+0x70/0x70
Sep 3 09:18:37 bp-01 kernel: [<ffffffff8158412c>] ret_from_fork+0x7c/0xb0
Sep 3 09:18:37 bp-01 kernel: [<ffffffff81070770>] ? kthread_freezable_should_stop+0x70/0x70
Sep 3 09:18:37 bp-01 kernel: INFO: task xdoio:12364 blocked for more than 120 seconds.
Sep 3 09:18:37 bp-01 kernel: Tainted: G E 3.17.0-rc3 #1
Sep 3 09:18:37 bp-01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 3 09:18:37 bp-01 kernel: xdoio D 0000000000000005 0 12364 12363 0x00000080
Sep 3 09:18:37 bp-01 kernel: ffff8804037af978 0000000000000082 ffff8804037af948 ffff88021726c1d0
Sep 3 09:18:37 bp-01 kernel: ffff8804037ac010 0000000000012bc0 0000000000012bc0 ffff8804063a0050
Sep 3 09:18:37 bp-01 kernel: ffff8804037af978 ffff88021fab2bc0 ffff8804063a0050 ffff8804037afa30
Sep 3 09:18:37 bp-01 kernel: Call Trace:
Sep 3 09:18:37 bp-01 kernel: [<ffffffff81580830>] ? yield_to+0x180/0x180
Sep 3 09:18:37 bp-01 kernel: [<ffffffff81580549>] schedule+0x29/0x70
Sep 3 09:18:37 bp-01 kernel: [<ffffffff8158061c>] io_schedule+0x8c/0xd0
Sep 3 09:18:37 bp-01 kernel: [<ffffffff8158085c>] bit_wait_io+0x2c/0x50
Sep 3 09:18:37 bp-01 kernel: [<ffffffff81580946>] __wait_on_bit_lock+0x76/0xb0
Sep 3 09:18:37 bp-01 kernel: [<ffffffff81580830>] ? yield_to+0x180/0x180
Sep 3 09:18:37 bp-01 kernel: [<ffffffff815809f8>] out_of_line_wait_on_bit_lock+0x78/0x90
Sep 3 09:18:37 bp-01 kernel: [<ffffffff8108eb90>] ? wake_atomic_t_function+0x40/0x40
Sep 3 09:18:37 bp-01 kernel: [<ffffffff811ca36e>] __lock_buffer+0x2e/0x30
Sep 3 09:18:37 bp-01 kernel: [<ffffffff811cb520>] __sync_dirty_buffer+0xa0/0xd0
Sep 3 09:18:37 bp-01 kernel: [<ffffffff811cb563>] sync_dirty_buffer+0x13/0x20
Sep 3 09:18:37 bp-01 kernel: [<ffffffffa0446e12>] journal_dirty_data+0x1f2/0x260 [jbd]
Sep 3 09:18:37 bp-01 kernel: [<ffffffffa0464200>] ext3_journal_dirty_data+0x20/0x50 [ext3]
Sep 3 09:18:37 bp-01 kernel: [<ffffffffa0464255>] journal_dirty_data_fn+0x25/0x30 [ext3]
Sep 3 09:18:37 bp-01 kernel: [<ffffffffa04638d4>] walk_page_buffers+0x84/0xc0 [ext3]
Sep 3 09:18:37 bp-01 kernel: [<ffffffffa0464230>] ? ext3_journal_dirty_data+0x50/0x50 [ext3]
Sep 3 09:18:37 bp-01 kernel: [<ffffffffa046797f>] ext3_ordered_write_end+0xaf/0x1e0 [ext3]
Sep 3 09:18:37 bp-01 kernel: [<ffffffff81134b42>] generic_perform_write+0x112/0x1c0
Sep 3 09:18:37 bp-01 kernel: [<ffffffff81137aaf>] __generic_file_write_iter+0x18f/0x390
Sep 3 09:18:37 bp-01 kernel: [<ffffffff81137cfe>] generic_file_write_iter+0x4e/0xd0
Sep 3 09:18:37 bp-01 kernel: [<ffffffff81197347>] do_iter_readv_writev+0x77/0xb0
Sep 3 09:18:37 bp-01 kernel: [<ffffffff8122e483>] ? security_file_permission+0x23/0x90
Sep 3 09:18:37 bp-01 kernel: [<ffffffff81137cb0>] ? __generic_file_write_iter+0x390/0x390
Sep 3 09:18:37 bp-01 kernel: [<ffffffff81137cb0>] ? __generic_file_write_iter+0x390/0x390
Sep 3 09:18:37 bp-01 kernel: [<ffffffff81198a60>] do_readv_writev+0xd0/0x320
Sep 3 09:18:37 bp-01 kernel: [<ffffffff81137cb0>] ? __generic_file_write_iter+0x390/0x390
Sep 3 09:18:37 bp-01 kernel: [<ffffffff81197400>] ? do_sync_readv_writev+0x80/0x80
Sep 3 09:18:37 bp-01 kernel: [<ffffffff810c724f>] ? do_futex+0xaf/0x1b0
Sep 3 09:18:37 bp-01 kernel: [<ffffffff81198cf1>] vfs_writev+0x41/0x50
Sep 3 09:18:37 bp-01 kernel: [<ffffffff81198e26>] SyS_writev+0x56/0xf0
Sep 3 09:18:37 bp-01 kernel: [<ffffffff810ec456>] ? __audit_syscall_exit+0x216/0x2c0
Sep 3 09:18:37 bp-01 kernel: [<ffffffff815841d2>] system_call_fastpath+0x16/0x1b

^ permalink raw reply	[flat|nested] 17+ messages in thread

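A note for readers unfamiliar with the raid1 barrier code these blocked tasks are sitting in: the sketch below is a heavily simplified paraphrase of the wait_barrier()/raise_barrier() pair in drivers/md/raid1.c of this era, not the verbatim source. The real functions also track the resync window (start_next_window, current_window_requests, next_window_requests, RESYNC_DEPTH, array_frozen), all omitted here; the point is only to show the handshake that has stalled.

/*
 * Simplified sketch, assuming the 3.17-era raid1 barrier scheme.
 * Normal I/O and resync exclude each other via conf->barrier and
 * conf->nr_pending, both protected by conf->resync_lock.
 */
static void wait_barrier(struct r1conf *conf)	/* normal I/O: make_request() */
{
	spin_lock_irq(&conf->resync_lock);
	/* sleep while resync holds the barrier over this region */
	wait_event_lock_irq(conf->wait_barrier, !conf->barrier,
			    conf->resync_lock);
	conf->nr_pending++;
	spin_unlock_irq(&conf->resync_lock);
}

static void raise_barrier(struct r1conf *conf)	/* resync: sync_request() */
{
	spin_lock_irq(&conf->resync_lock);
	conf->barrier++;
	/* sleep until the normal I/O already admitted has drained */
	wait_event_lock_irq(conf->wait_barrier, conf->nr_pending == 0,
			    conf->resync_lock);
	spin_unlock_irq(&conf->resync_lock);
}

In the traces above, kcopyd and the flusher thread are inside the first wait (make_request -> wait_barrier), while the mdX_resync trace shown elsewhere in this thread is inside the second (sync_request -> raise_barrier). If the pending/window accounting ever becomes inconsistent, neither wait can complete.
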
* Re: raid1 data corruption during resync
  2014-09-03 21:39 ` Brassow Jonathan
@ 2014-09-04 5:28   ` NeilBrown
  0 siblings, 0 replies; 17+ messages in thread
From: NeilBrown @ 2014-09-04 5:28 UTC (permalink / raw)
  To: Brassow Jonathan; +Cc: Eivind Sarto, linux-raid@vger.kernel.org Raid

[-- Attachment #1: Type: text/plain, Size: 14911 bytes --]

On Wed, 3 Sep 2014 16:39:52 -0500 Brassow Jonathan <jbrassow@redhat.com> wrote:

> 
> On Sep 2, 2014, at 8:45 PM, Brassow Jonathan wrote:
> 
> > 
> > On Sep 2, 2014, at 8:31 PM, NeilBrown wrote:
> > 
> >> Thanks for the details Jon. I don't suppose you have a stack trace from an
> >> upstream kernel? Not that I doubt you have one, and I suspect it is very
> >> similar. But I've learnt not to trust "suspect" and "presume" and "no doubt".
> >> 
> >> I'll have a look anyway.
> > 
> > Indeed - me too. I'll get started on recreating it on the latest upstream kernel now. It does take several hours to reproduce on average, so I likely won't send the update until tomorrow.
> > 
> > brassow
> 
> The test being run does the following:
> 1) Writes I/O to an FS which is sitting on a RAID dev (created via LVM, but using MD personalities)
> 2) The RAID dev is undergoing its initial resync
> 3) The device in slot 0 suffers a failure. (I can't be sure at the moment if I've seen this trigger from a device failure from a different slot.)
> Sorry the test is so involved. I haven't been able to create a simple reproducer and the tests I have take quite a while (i.e. a number of iterations) to hit the problem. These backtraces came after 45min, which is actually really quick compared to what I've seen.
> 
> The kernel I used was 3.17.0-rc3. It is unmodified - I have not applied the patch for reads to potentially resync'ing areas.
> 
> brassow
> 
> Sep 3 09:18:36 bp-01 kernel: INFO: task kworker/u130:1:6600 blocked for more than 120 seconds.
> Sep 3 09:18:36 bp-01 kernel: Tainted: G E 3.17.0-rc3 #1
> Sep 3 09:18:36 bp-01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 3 09:18:36 bp-01 kernel: kworker/u130:1 D 0000000000000008 0 6600 2 0x00000080
> Sep 3 09:18:36 bp-01 kernel: Workqueue: writeback bdi_writeback_workfn (flush-253:11)
> Sep 3 09:18:36 bp-01 kernel: ffff880415b3b868 0000000000000046 ffff880415b3b858 ffff88021727efc0
> Sep 3 09:18:36 bp-01 kernel: ffff880415b38010 0000000000012bc0 0000000000012bc0 ffff8804024e4150
> Sep 3 09:18:36 bp-01 kernel: ffff880415b3b838 ffff88041ea52bc0 ffff8804024e4150 ffff880415b3b920
> Sep 3 09:18:36 bp-01 kernel: Call Trace:
> Sep 3 09:18:36 bp-01 kernel: [<ffffffff81580830>] ? yield_to+0x180/0x180
> Sep 3 09:18:36 bp-01 kernel: [<ffffffff81580549>] schedule+0x29/0x70
> Sep 3 09:18:36 bp-01 kernel: [<ffffffff8158061c>] io_schedule+0x8c/0xd0
> Sep 3 09:18:36 bp-01 kernel: [<ffffffff8158085c>] bit_wait_io+0x2c/0x50
> Sep 3 09:18:36 bp-01 kernel: [<ffffffff81580946>] __wait_on_bit_lock+0x76/0xb0
> Sep 3 09:18:36 bp-01 kernel: [<ffffffff81135057>] __lock_page+0x67/0x70
> Sep 3 09:18:36 bp-01 kernel: [<ffffffff8108eb90>] ? wake_atomic_t_function+0x40/0x40
> Sep 3 09:18:36 bp-01 kernel: [<ffffffff81141468>] write_cache_pages+0x318/0x510
> Sep 3 09:18:36 bp-01 kernel: [<ffffffffa01d8b5a>] ? ext4_writepages+0x5a/0x6a0 [ext4]
> Sep 3 09:18:36 bp-01 kernel: [<ffffffff8113ffc0>] ? set_page_dirty+0x60/0x60
> Sep 3 09:18:36 bp-01 kernel: [<ffffffff811416b1>] generic_writepages+0x51/0x80
> Sep 3 09:18:36 bp-01 kernel: [<ffffffff81141715>] do_writepages+0x35/0x40
> Sep 3 09:18:36 bp-01 kernel: [<ffffffff811bfcf9>] __writeback_single_inode+0x49/0x230
> Sep 3 09:18:36 bp-01 kernel: [<ffffffff8108eeef>] ? wake_up_bit+0x2f/0x40
> Sep 3 09:18:36 bp-01 kernel: [<ffffffff811c3139>] writeback_sb_inodes+0x249/0x360
> Sep 3 09:18:36 bp-01 kernel: [<ffffffff8119a415>] ? put_super+0x25/0x40
> Sep 3 09:18:36 bp-01 kernel: [<ffffffff811c32ee>] __writeback_inodes_wb+0x9e/0xd0
> Sep 3 09:18:36 bp-01 kernel: [<ffffffff811c351b>] wb_writeback+0x1fb/0x2c0
> Sep 3 09:18:36 bp-01 kernel: [<ffffffff811c3662>] wb_do_writeback+0x82/0x1f0
> Sep 3 09:18:36 bp-01 kernel: [<ffffffff81078fda>] ? ttwu_do_activate.clone.0+0x4a/0x60
> Sep 3 09:18:36 bp-01 kernel: [<ffffffff811c3840>] bdi_writeback_workfn+0x70/0x210
> Sep 3 09:18:36 bp-01 kernel: [<ffffffff8106b762>] process_one_work+0x182/0x450
> Sep 3 09:18:36 bp-01 kernel: [<ffffffff8106bb4f>] worker_thread+0x11f/0x3c0
> Sep 3 09:18:36 bp-01 kernel: [<ffffffff8106ba30>] ? process_one_work+0x450/0x450
> Sep 3 09:18:36 bp-01 kernel: [<ffffffff8107083e>] kthread+0xce/0xf0
> Sep 3 09:18:36 bp-01 kernel: [<ffffffff81070770>] ? kthread_freezable_should_stop+0x70/0x70
> Sep 3 09:18:36 bp-01 kernel: [<ffffffff8158412c>] ret_from_fork+0x7c/0xb0
> Sep 3 09:18:36 bp-01 kernel: [<ffffffff81070770>] ? kthread_freezable_should_stop+0x70/0x70
> Sep 3 09:18:36 bp-01 kernel: INFO: task kworker/0:4:8765 blocked for more than 120 seconds.
> Sep 3 09:18:36 bp-01 kernel: Tainted: G E 3.17.0-rc3 #1
> Sep 3 09:18:36 bp-01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 3 09:18:36 bp-01 kernel: kworker/0:4 D 0000000000000000 0 8765 2 0x00000080
> Sep 3 09:18:36 bp-01 kernel: Workqueue: kcopyd do_work [dm_mod]
> Sep 3 09:18:36 bp-01 kernel: ffff8802102ab898 0000000000000046 0000000000000000 ffffffff81a19480
> Sep 3 09:18:36 bp-01 kernel: ffff8802102a8010 0000000000012bc0 0000000000012bc0 ffff880215950010
> Sep 3 09:18:36 bp-01 kernel: ffff8802102ab898 ffff88040171e100 ffff88040171e178 ffff8802102ab8c8
> Sep 3 09:18:36 bp-01 kernel: Call Trace:
> Sep 3 09:18:36 bp-01 kernel: [<ffffffff81580549>] schedule+0x29/0x70
> Sep 3 09:18:36 bp-01 kernel: [<ffffffffa0389e99>] wait_barrier+0xb9/0x260 [raid1]
> Sep 3 09:18:36 bp-01 kernel: [<ffffffff8108eb00>] ? bit_waitqueue+0xe0/0xe0
> Sep 3 09:18:36 bp-01 kernel: [<ffffffff8108ee13>] ? __wake_up+0x53/0x70
> Sep 3 09:18:36 bp-01 kernel: [<ffffffffa038caaa>] make_request+0x9a/0xc00 [raid1]
> Sep 3 09:18:37 bp-01 lvm[2109]: device-mapper: waitevent ioctl on (LVM-PSbaO1Zwzw2wYurZrkrTkI5grmcS3IQ3ot9VJPwplnllFf45AU5vBkbtjfLU0adG) failed: Interrupted system call
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff81137f60>] ? mempool_alloc+0x60/0x170
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff81186ea8>] ? cache_alloc_refill+0x98/0x2c0
> Sep 3 09:18:37 bp-01 kernel: [<ffffffffa03a4018>] raid_map+0x18/0x20 [dm_raid]
> Sep 3 09:18:37 bp-01 kernel: [<ffffffffa000336a>] __map_bio+0x4a/0x120 [dm_mod]
> Sep 3 09:18:37 bp-01 kernel: [<ffffffffa0003723>] __clone_and_map_data_bio+0x113/0x130 [dm_mod]
> Sep 3 09:18:37 bp-01 kernel: [<ffffffffa00037ac>] __split_and_process_non_flush+0x6c/0xb0 [dm_mod]
> Sep 3 09:18:37 bp-01 kernel: [<ffffffffa0003991>] __split_and_process_bio+0x1a1/0x200 [dm_mod]
> Sep 3 09:18:37 bp-01 kernel: [<ffffffffa0003b12>] _dm_request+0x122/0x190 [dm_mod]
> Sep 3 09:18:37 bp-01 kernel: [<ffffffffa0003ba8>] dm_request+0x28/0x40 [dm_mod]
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff812693b0>] generic_make_request+0xc0/0x100
> Sep 3 09:18:37 bp-01 kernel: [<ffffffffa03ba00c>] pending_complete+0x13c/0x2a0 [dm_snapshot]
> Sep 3 09:18:37 bp-01 kernel: [<ffffffffa03ba29e>] commit_callback+0xe/0x10 [dm_snapshot]
> Sep 3 09:18:37 bp-01 kernel: [<ffffffffa03bcea8>] persistent_commit_exception+0xb8/0x160 [dm_snapshot]
> Sep 3 09:18:37 bp-01 kernel: [<ffffffffa03ba1a3>] complete_exception+0x33/0x50 [dm_snapshot]
> Sep 3 09:18:37 bp-01 kernel: [<ffffffffa03ba26a>] copy_callback+0xaa/0xd0 [dm_snapshot]
> Sep 3 09:18:37 bp-01 kernel: [<ffffffffa03ba1c0>] ? complete_exception+0x50/0x50 [dm_snapshot]
> Sep 3 09:18:37 bp-01 kernel: [<ffffffffa000b788>] run_complete_job+0x98/0xe0 [dm_mod]
> Sep 3 09:18:37 bp-01 kernel: [<ffffffffa000b21b>] process_jobs+0x5b/0x100 [dm_mod]
> Sep 3 09:18:37 bp-01 kernel: [<ffffffffa000b6f0>] ? dispatch_job+0x80/0x80 [dm_mod]
> Sep 3 09:18:37 bp-01 kernel: [<ffffffffa000b2ff>] do_work+0x3f/0x80 [dm_mod]
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff8106b762>] process_one_work+0x182/0x450
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff8106bb4f>] worker_thread+0x11f/0x3c0
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff8106ba30>] ? process_one_work+0x450/0x450
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff8107083e>] kthread+0xce/0xf0
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff81070770>] ? kthread_freezable_should_stop+0x70/0x70
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff8158412c>] ret_from_fork+0x7c/0xb0
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff81070770>] ? kthread_freezable_should_stop+0x70/0x70
> Sep 3 09:18:37 bp-01 kernel: INFO: task kjournald:12268 blocked for more than 120 seconds.
> Sep 3 09:18:37 bp-01 kernel: Tainted: G E 3.17.0-rc3 #1
> Sep 3 09:18:37 bp-01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 3 09:18:37 bp-01 kernel: kjournald D 0000000000000000 0 12268 2 0x00000080
> Sep 3 09:18:37 bp-01 kernel: ffff88040d373c08 0000000000000046 ffff88040d373bd8 ffffffff81a19480
> Sep 3 09:18:37 bp-01 kernel: ffff88040d370010 0000000000012bc0 0000000000012bc0 ffff8804153a0e80
> Sep 3 09:18:37 bp-01 kernel: ffff88040d373c08 ffff88021fa12bc0 ffff8804153a0e80 ffffffff81580830
> Sep 3 09:18:37 bp-01 kernel: Call Trace:
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff81580830>] ? yield_to+0x180/0x180
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff81580549>] schedule+0x29/0x70
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff8158061c>] io_schedule+0x8c/0xd0
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff8158085c>] bit_wait_io+0x2c/0x50
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff81580b45>] __wait_on_bit+0x65/0x90
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff81580830>] ? yield_to+0x180/0x180
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff81580be8>] out_of_line_wait_on_bit+0x78/0x90
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff8108eb90>] ? wake_atomic_t_function+0x40/0x40
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff811ca14e>] __wait_on_buffer+0x2e/0x30
> Sep 3 09:18:37 bp-01 kernel: [<ffffffffa0448372>] journal_commit_transaction+0x872/0xf80 [jbd]
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff8108847f>] ? put_prev_entity+0x2f/0x400
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff810b21bb>] ? try_to_del_timer_sync+0x5b/0x70
> Sep 3 09:18:37 bp-01 kernel: [<ffffffffa044bae1>] kjournald+0xf1/0x270 [jbd]
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff8108eb00>] ? bit_waitqueue+0xe0/0xe0
> Sep 3 09:18:37 bp-01 kernel: [<ffffffffa044b9f0>] ? commit_timeout+0x10/0x10 [jbd]
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff8107083e>] kthread+0xce/0xf0
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff81070770>] ? kthread_freezable_should_stop+0x70/0x70
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff8158412c>] ret_from_fork+0x7c/0xb0
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff81070770>] ? kthread_freezable_should_stop+0x70/0x70
> Sep 3 09:18:37 bp-01 kernel: INFO: task xdoio:12364 blocked for more than 120 seconds.
> Sep 3 09:18:37 bp-01 kernel: Tainted: G E 3.17.0-rc3 #1
> Sep 3 09:18:37 bp-01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 3 09:18:37 bp-01 kernel: xdoio D 0000000000000005 0 12364 12363 0x00000080
> Sep 3 09:18:37 bp-01 kernel: ffff8804037af978 0000000000000082 ffff8804037af948 ffff88021726c1d0
> Sep 3 09:18:37 bp-01 kernel: ffff8804037ac010 0000000000012bc0 0000000000012bc0 ffff8804063a0050
> Sep 3 09:18:37 bp-01 kernel: ffff8804037af978 ffff88021fab2bc0 ffff8804063a0050 ffff8804037afa30
> Sep 3 09:18:37 bp-01 kernel: Call Trace:
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff81580830>] ? yield_to+0x180/0x180
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff81580549>] schedule+0x29/0x70
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff8158061c>] io_schedule+0x8c/0xd0
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff8158085c>] bit_wait_io+0x2c/0x50
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff81580946>] __wait_on_bit_lock+0x76/0xb0
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff81580830>] ? yield_to+0x180/0x180
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff815809f8>] out_of_line_wait_on_bit_lock+0x78/0x90
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff8108eb90>] ? wake_atomic_t_function+0x40/0x40
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff811ca36e>] __lock_buffer+0x2e/0x30
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff811cb520>] __sync_dirty_buffer+0xa0/0xd0
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff811cb563>] sync_dirty_buffer+0x13/0x20
> Sep 3 09:18:37 bp-01 kernel: [<ffffffffa0446e12>] journal_dirty_data+0x1f2/0x260 [jbd]
> Sep 3 09:18:37 bp-01 kernel: [<ffffffffa0464200>] ext3_journal_dirty_data+0x20/0x50 [ext3]
> Sep 3 09:18:37 bp-01 kernel: [<ffffffffa0464255>] journal_dirty_data_fn+0x25/0x30 [ext3]
> Sep 3 09:18:37 bp-01 kernel: [<ffffffffa04638d4>] walk_page_buffers+0x84/0xc0 [ext3]
> Sep 3 09:18:37 bp-01 kernel: [<ffffffffa0464230>] ? ext3_journal_dirty_data+0x50/0x50 [ext3]
> Sep 3 09:18:37 bp-01 kernel: [<ffffffffa046797f>] ext3_ordered_write_end+0xaf/0x1e0 [ext3]
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff81134b42>] generic_perform_write+0x112/0x1c0
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff81137aaf>] __generic_file_write_iter+0x18f/0x390
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff81137cfe>] generic_file_write_iter+0x4e/0xd0
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff81197347>] do_iter_readv_writev+0x77/0xb0
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff8122e483>] ? security_file_permission+0x23/0x90
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff81137cb0>] ? __generic_file_write_iter+0x390/0x390
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff81137cb0>] ? __generic_file_write_iter+0x390/0x390
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff81198a60>] do_readv_writev+0xd0/0x320
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff81137cb0>] ? __generic_file_write_iter+0x390/0x390
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff81197400>] ? do_sync_readv_writev+0x80/0x80
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff810c724f>] ? do_futex+0xaf/0x1b0
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff81198cf1>] vfs_writev+0x41/0x50
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff81198e26>] SyS_writev+0x56/0xf0
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff810ec456>] ? __audit_syscall_exit+0x216/0x2c0
> Sep 3 09:18:37 bp-01 kernel: [<ffffffff815841d2>] system_call_fastpath+0x16/0x1b

Thanks a lot. I love having concrete data....

I had a theory. Then another. None of them have really stood up yet. But there are definitely problems with the code.

1/ When the resync aborts close_sync() may set ->start_next_window to MaxSector without merging next_window_requests into current_window_requests. So those numbers can be wrong.

2/ raise_barrier() cannot detect if there are any pending requests in the region it is about to resync. That can lead to corruption!!

Neither of these explain the hang you are seeing. I note that the "md0-resync" thread isn't listed. I don't suppose you know what it was doing (stack trace)?? Also, had "md: md0: sync done" appeared in syslog yet?

Thanks,
NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

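Neil's point 1/ is easier to follow next to the window accounting it refers to. The sketch below paraphrases what close_sync() roughly does in this era of drivers/md/raid1.c, with the merge step whose absence he is describing written out; it is illustrative only, reconstructed from his description, and is not the actual patch that was eventually applied.

static void close_sync(struct r1conf *conf)
{
	/* ... barrier handshake and r1buf_pool teardown omitted ... */

	spin_lock_irq(&conf->resync_lock);
	conf->start_next_window = MaxSector;	/* resync finished or was aborted */
	/*
	 * Requests admitted against the "next" window were counted in
	 * next_window_requests.  If they are not folded into
	 * current_window_requests at this point (Neil's point 1/), the two
	 * counters stop matching the I/O actually in flight, and a later
	 * raise_barrier() can end up waiting on numbers that never drain.
	 */
	conf->current_window_requests += conf->next_window_requests;
	conf->next_window_requests = 0;
	spin_unlock_irq(&conf->resync_lock);
}

Point 2/ is separate from the counters: raise_barrier() only knows how many normal requests are pending, not which sectors they cover, so it cannot tell whether any of them overlap the region it is about to resync.
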
end of thread, other threads:[~2014-09-09 1:08 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <D4FE2D75-4208-48C9-A4D0-432F092E5AE9@redhat.com>
2014-09-08 15:52 ` Fwd: raid1 data corruption during resync Brassow Jonathan
2014-09-09 1:08 ` NeilBrown
2014-08-29 19:29 Eivind Sarto
2014-09-02 14:10 ` Brassow Jonathan
2014-09-02 16:43 ` Eivind Sarto
2014-09-02 17:04 ` Eivind Sarto
2014-09-02 16:59 ` Brassow Jonathan
2014-09-02 19:24 ` Brassow Jonathan
2014-09-02 22:07 ` Eivind Sarto
2014-09-02 22:14 ` Brassow Jonathan
2014-09-02 23:55 ` NeilBrown
2014-09-03 0:48 ` Eivind Sarto
2014-09-03 1:18 ` Brassow Jonathan
2014-09-03 1:31 ` NeilBrown
2014-09-03 1:45 ` Brassow Jonathan
2014-09-03 21:39 ` Brassow Jonathan
2014-09-04 5:28 ` NeilBrown