* constant array_state active after specific jobs
  From: pdi @ 2017-03-23  8:46 UTC
  To: linux-raid

Greetings all,

The problem in a nutshell is that an array is clean after boot, until
some specific jobs switch it to active, where it remains until reboot.

A similar problem was discussed, and solved, in
https://www.spinics.net/lists/raid/msg46450.html. However, AFAICT, it is
not the same issue.

I would be grateful for any insights as to why this happens and/or how
to prevent it. The relevant info follows; please let me know if anything
further might help. Many thanks in advance.

- uname -a
  Linux hostname 4.4.38 #1 SMP Sun Dec 11 16:03:41 CST 2016 x86_64
  Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz GenuineIntel GNU/Linux
- mdadm -V
  mdadm - v3.3.4 - 3rd August 2015
- Desktop drives without SCT/ERC, with timeout mismatch correction as per
  https://raid.wiki.kernel.org/index.php/Timeout_Mismatch
- /dev/md9 is a RAID10 array, 4 devices, far=2, with various dirs used
  as samba and nfs shares
- The array is in *constant* array_state active:
  - mdadm -D /dev/md9 | grep 'State :'
      State : active
  - cat /sys/block/md9/md/array_state
      active
- watch -d 'grep md9 /proc/diskstats'
    counters remain unchanged
- uptime
    load average: 0.00, 0.00, 0.00
- cat /sys/block/md9/md/safe_mode_delay
    0.201
- echo 0.1 > /sys/block/md9/md/safe_mode_delay
    array_state remains active
- echo clean > /sys/block/md9/md/array_state
    echo: write error: Device or resource busy
- reboot (with or without a prior check)
    array_state clean
- After reboot, the array remains clean until some specific jobs put it
  in the constant active state. Such jobs identified so far:
  - echo check > /sys/block/md9/md/sync_action
  - run an rsnapshot job
  - start a qemu/kvm vm
- Other jobs, like text/doc editing, multimedia playback, etc., leave
  array_state clean
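The sysfs reads above can be bundled into a tiny helper for repeated
checking. This is only a sketch: `md_state` and the `SYSFS` override are
names I made up, and the demo at the end runs against a throwaway fake
tree so the script is self-contained; point `SYSFS` at the real `/sys`
to use it on a live array.

```shell
#!/bin/sh
# Snapshot the md sysfs knobs used in the diagnosis above for one array.
# SYSFS can be overridden for testing; it defaults to the real /sys tree.
SYSFS="${SYSFS:-/sys}"

md_state() {
    d="$SYSFS/block/$1/md"
    printf 'array_state:     %s\n' "$(cat "$d/array_state")"
    printf 'safe_mode_delay: %s\n' "$(cat "$d/safe_mode_delay")"
}

# Self-contained demo against a fake tree; on a real host, leave SYSFS
# alone and just call:  md_state md9
SYSFS=$(mktemp -d)
mkdir -p "$SYSFS/block/md9/md"
echo active > "$SYSFS/block/md9/md/array_state"
echo 0.201  > "$SYSFS/block/md9/md/safe_mode_delay"
md_state md9
```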
* Re: constant array_state active after specific jobs
  From: NeilBrown <neilb@suse.com> @ 2017-03-24  5:25 UTC
  To: pdi, linux-raid; Cc: Shaohua Li

On Thu, Mar 23 2017, pdi wrote:
> The problem in a nutshell is that an array is clean after boot, until
> some specific jobs switch it to active, where it remains until reboot.
> [...]
> - After reboot, the array remains clean until some specific jobs put it
>   in the constant active state. Such jobs identified so far:
>   - echo check > /sys/block/md9/md/sync_action
>   - run an rsnapshot job
>   - start a qemu/kvm vm

This bug was introduced by
  commit 20d0189b1012 ("block: Introduce new bio_split()")
in 3.14, and fixed by
  commit 9b622e2bbcf0 ("raid10: increment write counter after bio is split")
in 4.8.

Maybe the latter patch should be sent to -stable?

NeilBrown
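Neil's version pointers (bug introduced in 3.14, fixed in 4.8) allow a
quick first-pass check of whether a given kernel version is new enough
to carry the fix. A sketch only: `has_fix` is my own name, and a pure
version compare cannot see that a distro or -stable kernel may carry the
backport at an older version number.

```shell
#!/bin/sh
# Heuristic: does a mainline kernel version carry the 4.8 raid10 fix?
# (Version-only check; backports to older -stable trees are invisible
# to it.)
has_fix() {
    v="$1"                     # e.g. "4.4.38", as printed by uname -r
    major=${v%%.*}
    rest=${v#*.}
    minor=${rest%%.*}
    [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 8 ]; }
}

for k in 3.14.0 4.4.38 4.8.0 4.9.1; do
    if has_fix "$k"; then
        echo "$k: fix expected"
    else
        echo "$k: fix missing (version < 4.8)"
    fi
done
```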
* Re: constant array_state active after specific jobs
  From: pdi @ 2017-03-24  7:04 UTC
  To: NeilBrown; Cc: linux-raid, Shaohua Li

On Fri, 24 Mar 2017 16:25:35 +1100 NeilBrown <neilb@suse.com> wrote:
> This bug was introduced by
>   commit 20d0189b1012 ("block: Introduce new bio_split()")
> in 3.14, and fixed by
>   commit 9b622e2bbcf0 ("raid10: increment write counter after bio is split")
> in 4.8.
>
> Maybe the latter patch should be sent to -stable?

NeilBrown, thank you for your swift and concise answer.

I gather you are referring to kernel version numbers. The described
behaviour was first noticed many months ago with kernel 2.6.37.6, and it
persisted after a system upgrade to kernel 4.4.38. However, two things
were corrected after the upgrade: the timeout mismatch, and a
Current_Pending_Sector on one of the drives. That may, or may not,
explain the occurrence with the older kernel.

Is this constant active state on the data array something to worry
about, so that I should try a kernel >= 4.8, or shall I let it be?

pdi
* Re: constant array_state active after specific jobs
  From: NeilBrown <neilb@suse.com> @ 2017-03-26 22:42 UTC
  To: pdi; Cc: linux-raid, Shaohua Li

On Fri, Mar 24 2017, pdi wrote:
> [...]
> Is this constant active state on the data array something to worry
> about, so that I should try a kernel >= 4.8, or shall I let it be?

The only important consequence of the constant active state is that if
your machine crashes at a moment when the array would otherwise have
been idle, a resync will be needed after reboot. Without the constant
active state, that resync would not have been needed.

If you have a write-intent bitmap, this is not particularly relevant.

I cannot say how important it is to you to avoid a resync after a crash,
so I don't know whether you should just let it be or not.

NeilBrown
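Whether a write-intent bitmap is present can be read from sysfs: md
exposes a `bitmap/location` attribute that reads "none" when no bitmap
is configured (my reading of kernels of that era). A sketch —
`has_bitmap` and the `SYSFS` override are made-up names, and the demo
uses a fake tree so it runs anywhere; on a real array an internal bitmap
can be added with `mdadm --grow --bitmap=internal /dev/md9`.

```shell
#!/bin/sh
# Does an md array have a write-intent bitmap?  md exposes
# md/bitmap/location in sysfs; it reads "none" when no bitmap is set.
SYSFS="${SYSFS:-/sys}"

has_bitmap() {
    loc="$SYSFS/block/$1/md/bitmap/location"
    [ -r "$loc" ] && [ "$(cat "$loc")" != "none" ]
}

# Self-contained demo against a fake tree; on a real host leave SYSFS
# at /sys and call e.g.:
#   has_bitmap md9 || echo "no bitmap: a crash while active means a full resync"
SYSFS=$(mktemp -d)
mkdir -p "$SYSFS/block/md9/md/bitmap"
echo none > "$SYSFS/block/md9/md/bitmap/location"
if has_bitmap md9; then echo "md9: bitmap present"; else echo "md9: no bitmap"; fi
```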
* Re: constant array_state active after specific jobs
  From: pdi @ 2017-03-28 13:44 UTC
  To: NeilBrown; Cc: linux-raid, Shaohua Li

On Mon, 27 Mar 2017 09:42:29 +1100 NeilBrown <neilb@suse.com> wrote:
> [...]
> I cannot say how important it is to you to avoid a resync after a
> crash, so I don't know whether you should just let it be or not.

NeilBrown,

Thank you for your clear explanation.

Best regards,
pdi
* Re: constant array_state active after specific jobs
  From: Shaohua Li @ 2017-03-27 18:08 UTC
  To: NeilBrown; Cc: pdi, linux-raid

On Fri, Mar 24, 2017 at 04:25:35PM +1100, Neil Brown wrote:
> This bug was introduced by
>   commit 20d0189b1012 ("block: Introduce new bio_split()")
> in 3.14, and fixed by
>   commit 9b622e2bbcf0 ("raid10: increment write counter after bio is split")
> in 4.8.
>
> Maybe the latter patch should be sent to -stable?

Sure, looks suitable, will do it now.

Thanks,
Shaohua