Re: constant array_state active after specific jobs

linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: NeilBrown <neilb@suse.com>
To: pdi <pdi@otenet.gr>
Cc: linux-raid@vger.kernel.org, Shaohua Li <shli@kernel.org>
Subject: Re: constant array_state active after specific jobs
Date: Mon, 27 Mar 2017 09:42:29 +1100	[thread overview]
Message-ID: <87lgrr8xt6.fsf@notabene.neil.brown.name> (raw)
In-Reply-To: <20170324090442.3d01347b@hal9.pdi.lan>

[-- Attachment #1: Type: text/plain, Size: 3594 bytes --]

On Fri, Mar 24 2017, pdi wrote:

> On Fri, 24 Mar 2017 16:25:35 +1100
> NeilBrown <neilb@suse.com> wrote:
>
>> On Thu, Mar 23 2017, pdi wrote:
>> 
>> > Greetings all,
>> >
>> > The problem in a nutshell is that an array is clean after boot,
>> > until some specific jobs switch it to active where it remains until
>> > reboot.
>> >
>> > A similar problem was discussed, and solved, in 
>> > https://www.spinics.net/lists/raid/msg46450.html. However, AFAICT,
>> > it is not the same issue.
>> >
>> > I would be grateful for any insights as to why this happens and/or
>> > how to prevent it.
>> >
>> > The relevant info follows, please let me know if anything further
>> > might help.
>> >
>> > Many thanks in advance.
>> >
>> > - uname -a
>> >   Linux hostname 4.4.38 #1 SMP Sun Dec 11 16:03:41 CST 2016 x86_64
>> >   Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz GenuineIntel GNU/Linux
>> > - mdadm -V
>> >   mdadm - v3.3.4 - 3rd August 2015
>> > - Desktop drives without sct/erc,
>> >   with timeout mismatch correction as per
>> >   https://raid.wiki.kernel.org/index.php/Timeout_Mismatch
>> > - /dev/md9 is a raid10 array, 4 devices, far=2,
>> >   with various dirs used as samba and nfs shares
>> > - The array is in *constant* array_state active
>> > - mdadm -D /dev/md9 | grep 'State :'
>> >   State : active
>> > - cat /sys/block/md9/md/array_state
>> >   active
>> > - watch -d 'grep md9 /proc/diskstats'
>> >   remain unchanged
>> > - uptime
>> >   load average: 0.00, 0.00, 0.00
>> > - cat /sys/block/md9/md/safe_mode_delay
>> >   0.201
>> > - echo 0.1 > /sys/block/md9/md/safe_mode_delay
>> >   array_state remains active
>> > - echo clean > /sys/block/md9/md/array_state
>> >   echo: write error: Device or resource busy
>> > - reboot (with or without prior check)
>> >   array_state clean
>> > - After reboot, array remains clean until some specific
>> >   jobs put it in constant active state. Such jobs so far
>> >   identified:
>> >   - echo check > /sys/block/md9/md/sync_action
>> >   - run an rsnapshot job
>> >   - start a qemu/kvm vm
>> > - Other jobs, like text/doc editing, multimedia playback,
>> >   etc retain array_state clean  
>> 
>> This bug was introduced by
>> Commit: 20d0189b1012 ("block: Introduce new bio_split()")
>> in 3.14, and fixed by
>> Commit: 9b622e2bbcf0 ("raid10: increment write counter after bio is
>> split") in 4.8.
>> 
>> Maybe the latter patch should be sent to -stable ??
>> 
>> NeilBrown
>
> NeilBrown, thank you for your swift and concise answer.
>
> I gather you are referring to kernel version numbers. The described
> behaviour was first noticed many months ago with kernel 2.6.37.6, and
> persisted after a system upgrade and kernel 4.4.38. However, after the
> upgrade two things were corrected, the timeout mismatch, and a
> Current_Pending_Sector in one of the drives; which may, or may not,
> explain the occurrence with the older kernel.
>
> Is this constant active state in the data array something to worry about
> and try kernel >= 4.8, or shall I let be?

The only important consequence of the constant active state is that if
your machine crashes at a moment when the array would otherwise have
been idle, then a resync will be needed after reboot.  Without the
constant active state, that resync would not have been needed.

If you have a write-intent bitmap, this is not particularly relevant.

I cannot say how important it is to you to avoid a resync after a crash,
so I don't know if you should just let it be or not.

NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

next prev parent reply	other threads:[~2017-03-26 22:42 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-23  8:46 constant array_state active after specific jobs pdi
2017-03-24  5:25 ` NeilBrown
2017-03-24  7:04   ` pdi
2017-03-26 22:42     ` NeilBrown [this message]
2017-03-28 13:44       ` pdi
2017-03-27 18:08   ` Shaohua Li

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87lgrr8xt6.fsf@notabene.neil.brown.name \
    --to=neilb@suse.com \
    --cc=linux-raid@vger.kernel.org \
    --cc=pdi@otenet.gr \
    --cc=shli@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).