linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* dmesg deluge: RAID1 conf printout
@ 2012-04-21 13:55 Jan Ceuleers
  2012-04-21 21:00 ` NeilBrown
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Ceuleers @ 2012-04-21 13:55 UTC (permalink / raw)
  To: Linux RAID

Hi

Ever since I installed a 3.2 kernel on my Ubuntu 11.04 system, dmesg has 
been flooded with messages such as the following:

[15746.465106]  disk 1, wo:0, o:1, dev:sdb2
[15748.100302] RAID1 conf printout:
[15748.100306]  --- wd:2 rd:2
[15748.100310]  disk 0, wo:0, o:1, dev:sdd2
[15748.100313]  disk 1, wo:0, o:1, dev:sdb2
[15748.438638] RAID1 conf printout:
[15748.438642]  --- wd:2 rd:2
[15748.438646]  disk 0, wo:0, o:1, dev:sdd2
[15748.438649]  disk 1, wo:0, o:1, dev:sdb2
[15751.181020] RAID1 conf printout:
[15751.181026]  --- wd:2 rd:2
[15751.181030]  disk 0, wo:0, o:1, dev:sdd2
[15751.181034]  disk 1, wo:0, o:1, dev:sdb2
[15751.630244] RAID1 conf printout:
[15751.630250]  --- wd:2 rd:2
[15751.630255]  disk 0, wo:0, o:1, dev:sdd2
[15751.630259]  disk 1, wo:0, o:1, dev:sdb2
[15754.004658] RAID1 conf printout:
[15754.004662]  --- wd:2 rd:2
[15754.004666]  disk 0, wo:0, o:1, dev:sdd2
[15754.004670]  disk 1, wo:0, o:1, dev:sdb2
[15754.312749] RAID1 conf printout:
[15754.312754]  --- wd:2 rd:2
[15754.312758]  disk 0, wo:0, o:1, dev:sdd2
[15754.312762]  disk 1, wo:0, o:1, dev:sdb2
[15759.231107] RAID1 conf printout:
[15759.231112]  --- wd:2 rd:2
[15759.231115]  disk 0, wo:0, o:1, dev:sdd2
[15759.231118]  disk 1, wo:0, o:1, dev:sdb2

(etc, ad infinitum)

I'm now running a 3.3 kernel and it's still happening. My RAID sets are 
working just fine; it's just this nuisance log spamming that's annoying.

Anything I can do about that?

root@zotac:~# uname -a
Linux zotac 3.3.2-030302-generic #201204131335 SMP Fri Apr 13 17:36:17 
UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
root@zotac:~# mdadm -V
mdadm - v3.1.4 - 31st August 2010


Thanks, Jan

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: dmesg deluge: RAID1 conf printout
  2012-04-21 13:55 dmesg deluge: RAID1 conf printout Jan Ceuleers
@ 2012-04-21 21:00 ` NeilBrown
  2012-04-22  8:17   ` Jan Ceuleers
  2012-04-22 17:21   ` Jan Ceuleers
  0 siblings, 2 replies; 7+ messages in thread
From: NeilBrown @ 2012-04-21 21:00 UTC (permalink / raw)
  To: Jan Ceuleers; +Cc: Linux RAID

[-- Attachment #1: Type: text/plain, Size: 2640 bytes --]

On Sat, 21 Apr 2012 15:55:15 +0200 Jan Ceuleers <jan.ceuleers@computer.org>
wrote:

> Hi
> 
> Ever since I installed a 3.2 kernel on my Ubuntu 11.04 system, dmesg has 
> been flooded with messages such as the following:
> 
> [15746.465106]  disk 1, wo:0, o:1, dev:sdb2
> [15748.100302] RAID1 conf printout:
> [15748.100306]  --- wd:2 rd:2
> [15748.100310]  disk 0, wo:0, o:1, dev:sdd2
> [15748.100313]  disk 1, wo:0, o:1, dev:sdb2
> [15748.438638] RAID1 conf printout:
> [15748.438642]  --- wd:2 rd:2
> [15748.438646]  disk 0, wo:0, o:1, dev:sdd2
> [15748.438649]  disk 1, wo:0, o:1, dev:sdb2
> [15751.181020] RAID1 conf printout:
> [15751.181026]  --- wd:2 rd:2
> [15751.181030]  disk 0, wo:0, o:1, dev:sdd2
> [15751.181034]  disk 1, wo:0, o:1, dev:sdb2
> [15751.630244] RAID1 conf printout:
> [15751.630250]  --- wd:2 rd:2
> [15751.630255]  disk 0, wo:0, o:1, dev:sdd2
> [15751.630259]  disk 1, wo:0, o:1, dev:sdb2
> [15754.004658] RAID1 conf printout:
> [15754.004662]  --- wd:2 rd:2
> [15754.004666]  disk 0, wo:0, o:1, dev:sdd2
> [15754.004670]  disk 1, wo:0, o:1, dev:sdb2
> [15754.312749] RAID1 conf printout:
> [15754.312754]  --- wd:2 rd:2
> [15754.312758]  disk 0, wo:0, o:1, dev:sdd2
> [15754.312762]  disk 1, wo:0, o:1, dev:sdb2
> [15759.231107] RAID1 conf printout:
> [15759.231112]  --- wd:2 rd:2
> [15759.231115]  disk 0, wo:0, o:1, dev:sdd2
> [15759.231118]  disk 1, wo:0, o:1, dev:sdb2
> 
> (etc, ad infinitum)
> 
> I'm now running a 3.3 kernel and it's still happening. My RAID sets are 
> working just fine; it's just this nuisance log spamming that's annoying.
> 
> Anything I can do about that?

Best thing you can do is report it and hope some maintainer notices and helps
out.  Oh wait, you did that :-)

Looks like:

   commit 7bfec5f35c68121e7b1849f3f4166dd96c8da5b3

is at fault.  It causes md to attempt to add spares into the array more often.
Would I be right in guessing that you have one spare in this array?
If you remove the spare, the messages should stop.
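
Something along these lines should do it (the array and device names below
are only placeholders; substitute your own):

   mdadm /dev/mdX --remove /dev/sdYZ     # placeholder array and spare device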

I think I know how I'll fix it but it'll have to wait for tomorrow.  Then
I'll send you a patch to test.

Thanks for the report,
NeilBrown

> 
> root@zotac:~# uname -a
> Linux zotac 3.3.2-030302-generic #201204131335 SMP Fri Apr 13 17:36:17 
> UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
> root@zotac:~# mdadm -V
> mdadm - v3.1.4 - 31st August 2010
> 
> 
> Thanks, Jan
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: dmesg deluge: RAID1 conf printout
  2012-04-21 21:00 ` NeilBrown
@ 2012-04-22  8:17   ` Jan Ceuleers
  2012-04-22 17:21   ` Jan Ceuleers
  1 sibling, 0 replies; 7+ messages in thread
From: Jan Ceuleers @ 2012-04-22  8:17 UTC (permalink / raw)
  To: NeilBrown; +Cc: Linux RAID

NeilBrown wrote:
>> Anything I can do about that?
>
> Best thing you can do is report it and hope some maintainer notices and helps
> out.  Oh wait, you did that :-)
>
> Looks like:
>
>     commit 7bfec5f35c68121e7b1849f3f4166dd96c8da5b3
>
> is at fault.  It causes md to attempt to add spares into the array more often.
> Would I be right in guessing that you have one spare in this array?
> If you remove the spare, the messages should stop.

Hi Neil.

As ever many thanks for your responsiveness and helpfulness.

I have two RAID1 sets, each with one spare:

root@zotac:~# cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] 
[raid4] [raid10]
md0 : active raid1 sdb1[1] sde2[2](S) sdd1[0]
       521984 blocks [2/2] [UU]
       bitmap: 0/1 pages [0KB], 65536KB chunk

md1 : active raid1 sdb2[1] sde3[2](S) sdd2[0]
       487861824 blocks [2/2] [UU]
       bitmap: 1/4 pages [4KB], 65536KB chunk

unused devices: <none>

As you can see, there are no spares left to add. Upon my last reboot the 
spares were added automatically. Having said that, upon the previous 
reboot I had to add them manually. The devices in question were also 
missing from /dev/disk/by-uuid. This may be an entirely different 
problem though, and not one I'm worried about given the mismatch between 
the kernel version and the rest of the distro.
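
(For completeness, the manual re-add was along these lines, using the spare
partitions shown in the mdstat output above; I'm reconstructing the exact
commands from memory:)

root@zotac:~# mdadm /dev/md0 --add /dev/sde2   # sde2 was md0's spare (see mdstat)
root@zotac:~# mdadm /dev/md1 --add /dev/sde3   # sde3 was md1's spare (see mdstat)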

I'm only running this recent a kernel because its support of a 
particular piece of hardware I have seems to be much more stable than 
with my distro's regular kernel.

In case it's important:

root@zotac:~# cat /etc/mdadm/mdadm.conf | grep -v ^#

DEVICE partitions

CREATE owner=root group=disk mode=0660 auto=yes

HOMEHOST <system>

MAILADDR root

ARRAY /dev/md0 level=raid1 num-devices=2 spares=1 
UUID=c2bfb80c:0cbf5cd9:67c62432:ead2ec0c
ARRAY /dev/md1 level=raid1 num-devices=2 spares=1 
UUID=442e9934:97191d8e:6d0cf7a9:41621837

> I think I know how I'll fix it but it'll have to wait for tomorrow.  Then
> I'll send you a patch to test.

I'll have to set up a build environment, but I'll certainly give that a go.

Thanks, Jan

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: dmesg deluge: RAID1 conf printout
  2012-04-21 21:00 ` NeilBrown
  2012-04-22  8:17   ` Jan Ceuleers
@ 2012-04-22 17:21   ` Jan Ceuleers
  2012-04-22 23:48     ` NeilBrown
  1 sibling, 1 reply; 7+ messages in thread
From: Jan Ceuleers @ 2012-04-22 17:21 UTC (permalink / raw)
  To: NeilBrown; +Cc: Linux RAID

NeilBrown wrote:
> Looks like:
>
>     commit 7bfec5f35c68121e7b1849f3f4166dd96c8da5b3
>
> is at fault.  It causes md to attempt to add spares into the array more often.
> Would I be right in guessing that you have one spare in this array?
> If you remove the spare, the messages should stop.

Hmmm. The commit message is as follows:

commit 7bfec5f35c68121e7b1849f3f4166dd96c8da5b3
Author: NeilBrown <neilb@suse.de>
Date:   Fri Dec 23 10:17:53 2011 +1100

     md/raid5: If there is a spare and a want_replacement device, start 
replacement.

     When attempting to add a spare to a RAID[456] array, also consider
     adding it as a replacement for a want_replacement device.

     This requires that common md code attempt hot_add even when the array
     is not formally degraded.

     Reviewed-by: Dan Williams <dan.j.williams@intel.com>
     Signed-off-by: NeilBrown <neilb@suse.de>


Does this also apply to RAID1 (which is all I've got on this machine: no 
RAID456)?

Jan

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: dmesg deluge: RAID1 conf printout
  2012-04-22 17:21   ` Jan Ceuleers
@ 2012-04-22 23:48     ` NeilBrown
  2012-04-23  7:01       ` Jan Ceuleers
  0 siblings, 1 reply; 7+ messages in thread
From: NeilBrown @ 2012-04-22 23:48 UTC (permalink / raw)
  To: Jan Ceuleers; +Cc: Linux RAID

[-- Attachment #1: Type: text/plain, Size: 3267 bytes --]

On Sun, 22 Apr 2012 19:21:11 +0200 Jan Ceuleers <jan.ceuleers@computer.org>
wrote:

> NeilBrown wrote:
> > Looks like:
> >
> >     commit 7bfec5f35c68121e7b1849f3f4166dd96c8da5b3
> >
> > is at fault.  It causes md to attempt to add spares into the array more often.
> > Would I be right in guessing that you have one spare in this array?
> > If you remove the spare, the messages should stop.
> 
> Hmmm. The commit message is as follows:
> 
> commit 7bfec5f35c68121e7b1849f3f4166dd96c8da5b3
> Author: NeilBrown <neilb@suse.de>
> Date:   Fri Dec 23 10:17:53 2011 +1100
> 
>      md/raid5: If there is a spare and a want_replacement device, start 
> replacement.
> 
>      When attempting to add a spare to a RAID[456] array, also consider
>      adding it as a replacement for a want_replacement device.
> 
>      This requires that common md code attempt hot_add even when the array
>      is not formally degraded.
> 
>      Reviewed-by: Dan Williams <dan.j.williams@intel.com>
>      Signed-off-by: NeilBrown <neilb@suse.de>
> 
> 
> Does this also apply to RAID1 (which is all I've got on this machine: no 
> RAID456)?

Yes, it does apply to RAID1.  Part of the patch was RAID5-specific but part of
it was to common code that would affect other levels.  That part was not
meant to be a big change, but it turned out to be a little bigger than I
expected.

The following should fix it.

Thanks again for the report,
NeilBrown


From 321f820a905993f694f7ba4347492e9273831813 Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.de>
Date: Mon, 23 Apr 2012 09:46:28 +1000
Subject: [PATCH] md: don't call ->add_disk unless there is good reason.

Commit 7bfec5f35c68121e7b18

   md/raid5: If there is a spare and a want_replacement device, start replacement.

causes md_check_recovery to call ->add_disk much more often.
Instead of only when the array is degraded, it is now called whenever
md_check_recovery finds anything useful to do, which includes
updating the metadata for clean<->dirty transition.
This causes unnecessary work, and causes info messages from ->add_disk
to be reported much too often.

So refine md_check_recovery to only do any actual recovery checking
(including ->add_disk) if MD_RECOVERY_NEEDED is set.

This fix is suitable for 3.3.y:

Cc: stable@vger.kernel.org
Reported-by: Jan Ceuleers <jan.ceuleers@computer.org>
Signed-off-by: NeilBrown <neilb@suse.de>

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 9524192..47f1fdb6 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -7813,14 +7813,14 @@ void md_check_recovery(struct mddev *mddev)
 		 * any transients in the value of "sync_action".
 		 */
 		set_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
-		clear_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
 		/* Clear some bits that don't mean anything, but
 		 * might be left set
 		 */
 		clear_bit(MD_RECOVERY_INTR, &mddev->recovery);
 		clear_bit(MD_RECOVERY_DONE, &mddev->recovery);
 
-		if (test_bit(MD_RECOVERY_FROZEN, &mddev->recovery))
+		if (!test_and_clear_bit(MD_RECOVERY_NEEDED, &mddev->recovery) ||
+		    test_bit(MD_RECOVERY_FROZEN, &mddev->recovery))
 			goto unlock;
 		/* no recovery is running.
 		 * remove any failed drives, then

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: dmesg deluge: RAID1 conf printout
  2012-04-22 23:48     ` NeilBrown
@ 2012-04-23  7:01       ` Jan Ceuleers
  2012-04-27  1:30         ` NeilBrown
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Ceuleers @ 2012-04-23  7:01 UTC (permalink / raw)
  To: NeilBrown; +Cc: Linux RAID

NeilBrown wrote:
> The following should fix it.

Neil,

Since I had not been running a self-compiled kernel when I reported 
this, I first reproduced the issue on vanilla 3.3.0: git clone, git 
checkout, build (using the Ubuntu config file for 3.3.0), install, boot.
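
(Roughly the following; this is a sketch of the steps rather than a verbatim
transcript, and the config file name is from memory:)

root@zotac:~# git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
root@zotac:~# cd linux
root@zotac:~/linux# git checkout v3.3
root@zotac:~/linux# cp /boot/config-3.3.0-030300-generic .config   # Ubuntu config for 3.3.0
root@zotac:~/linux# make oldconfig
root@zotac:~/linux# make -j4 deb-pkg
root@zotac:~/linux# dpkg -i ../linux-image-3.3.0*.deb
root@zotac:~/linux# reboot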

The spares were not being added to either RAID 1 set, and the logs were 
not being flooded. The flooding, however, began immediately when I added 
the spares manually.

I then built a kernel with your patch on top, and upon booting the 
spares were being added automatically, and the logs are not being 
spammed. The problem therefore indeed appears to be solved.

Tested-by: Jan Ceuleers <jan.ceuleers@computer.org>

Thanks, Jan

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: dmesg deluge: RAID1 conf printout
  2012-04-23  7:01       ` Jan Ceuleers
@ 2012-04-27  1:30         ` NeilBrown
  0 siblings, 0 replies; 7+ messages in thread
From: NeilBrown @ 2012-04-27  1:30 UTC (permalink / raw)
  To: Jan Ceuleers; +Cc: Linux RAID

[-- Attachment #1: Type: text/plain, Size: 1232 bytes --]

On Mon, 23 Apr 2012 09:01:45 +0200 Jan Ceuleers <jan.ceuleers@computer.org>
wrote:

> NeilBrown wrote:
> > The following should fix it.
> 
> Neil,
> 
> Since I had not been running a self-compiled kernel when I reported 
> this, I first reproduced the issue on vanilla 3.3.0: git clone, git 
> checkout, build (using the Ubuntu config file for 3.3.0), install, boot.
> 
> The spares were not being added to either RAID 1 set, and the logs were 
> not being flooded. The flooding, however, began immediately when I added 
> the spares manually.
> 
> I then built a kernel with your patch on top, and upon booting the 
> spares were being added automatically, and the logs are not being 
> spammed. The problem therefore indeed appears to be solved.

Thanks for confirming.

> 
> Tested-by: Jan Ceuleers <jan.ceuleers@computer.org>

I had meant to add this when I submitted the patch, but other things pushed
it from my mind.
It is appreciated though.

Thanks,
NeilBrown



> 
> Thanks, Jan
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2012-04-27  1:30 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed
-- links below jump to the message on this page --)
2012-04-21 13:55 dmesg deluge: RAID1 conf printout Jan Ceuleers
2012-04-21 21:00 ` NeilBrown
2012-04-22  8:17   ` Jan Ceuleers
2012-04-22 17:21   ` Jan Ceuleers
2012-04-22 23:48     ` NeilBrown
2012-04-23  7:01       ` Jan Ceuleers
2012-04-27  1:30         ` NeilBrown

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).