public inbox for linux-raid@vger.kernel.org
From: Guoqing Jiang <jgq516@gmail.com>
To: linux-raid@vger.kernel.org
Subject: Re: Intermittent stalling of all MD IO, Debian buster (4.19.0-16)
Date: Fri, 18 Jun 2021 13:35:08 +0800
Message-ID: <33236a83-a14d-a9e0-5384-91aa007858dc@gmail.com>
In-Reply-To: <20210616150549.ojm3nvdamkmqb6ev@bitfolk.com>

Hi Andy,

On 6/16/21 11:05 PM, Andy Smith wrote:
> Hi Guoqing,
>
> Thanks for looking at this.
>
> On Wed, Jun 16, 2021 at 11:57:33AM +0800, Guoqing Jiang wrote:
>> The above looks like the bio for sb write was throttled by wbt, which caused
>> the first calltrace.
>> I am wondering whether intensive IO was happening on the
>> underlying device of md5, which triggered wbt to throttle the sb
>> write. Or can you access the underlying device directly?
> Next time it occurs I can check if I am able to read from the SSDs
> that make up the MD device, if that information would be helpful.
>
> I have never been able to replicate the problem in a test
> environment so it is likely that it needs to be under heavy load for
> it to happen.

I guess so, and a reliable reproducer would definitely help us analyze
the root cause.
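
Until there is a reproducer, it may help to record the wbt setting of each
component device when a stall occurs. The following is only a minimal
sketch, not something taken from this thread: it assumes the array is md5,
that the kernel exposes queue/wbt_lat_usec in sysfs (it is only present when
wbt is built in), and the helper names (md_slaves, queue_dir, wbt_lat_usec,
wbt_report) are made up for illustration.

```python
import os


def md_slaves(md_name, sysfs="/sys"):
    """Return the component device names listed in /sys/block/<md>/slaves."""
    path = os.path.join(sysfs, "block", md_name, "slaves")
    return sorted(os.listdir(path)) if os.path.isdir(path) else []


def queue_dir(dev_name, sysfs="/sys"):
    """Find the request-queue directory for a disk or partition.

    A partition (e.g. sda1) has no queue/ of its own, so fall back to the
    parent disk directory.
    """
    dev = os.path.realpath(os.path.join(sysfs, "class", "block", dev_name))
    for candidate in (dev, os.path.dirname(dev)):
        queue = os.path.join(candidate, "queue")
        if os.path.isdir(queue):
            return queue
    return None


def wbt_lat_usec(dev_name, sysfs="/sys"):
    """Return the wbt latency target in microseconds, or None if unavailable."""
    queue = queue_dir(dev_name, sysfs)
    if queue is None:
        return None
    try:
        with open(os.path.join(queue, "wbt_lat_usec")) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def wbt_report(md_name, sysfs="/sys"):
    """Map each component device of the array to its wbt_lat_usec value."""
    return {dev: wbt_lat_usec(dev, sysfs) for dev in md_slaves(md_name, sysfs)}
```

Calling wbt_report("md5") on the affected host would return a dict keyed by
component device name. A value of 0 means wbt is disabled for that queue,
and None means the attribute could not be read.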

>> And there was a report [1] for raid5 which may be related to wbt throttling
>> as well; not sure if the change [2] could help or not.
>>
>> [1]. https://lore.kernel.org/linux-raid/d3fced3f-6c2b-5ffa-fd24-b24ec6e7d4be@xmyslivec.cz/
>> [2]. https://lore.kernel.org/linux-raid/cb0f312e-55dc-cdc4-5d2e-b9b415de617f@gmail.com/
> All of my MD arrays tend to be RAID-1 or RAID-10, two devices, no
> journal, internal bitmap. I see the reporter of this problem was
> using RAID-6 with an external write journal. I can still build a
> kernel with this patch and try it out, if you think it could possibly
> help.

Yes, because both issues have wbt-related call traces, even though the
RAID levels are different.

> The long time between incidents obviously makes things
> extra challenging.
>
> The next step I have taken is to put the buster-backports kernel
> package (5.10.24-1~bpo10+1) on two test servers, and will also boot
> the production hosts into this if they should experience the problem
> again.

Good luck :).

Thanks,
Guoqing

