From: Roman Mamedov <rm@romanrm.net>
To: Ryan Patterson <ryan.goat@gmail.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: mdadm resync causes stable system to crash every 2 or 3 hours
Date: Tue, 7 Sep 2021 12:52:01 +0500
Message-ID: <20210907125201.0cc77658@natsu>
In-Reply-To: <CA+Kggd7mUF9MWdJsLtAQMv=KXtwaNvj6BqfM+NMyffE86iHBoQ@mail.gmail.com>
On Mon, 6 Sep 2021 20:44:31 -0400
Ryan Patterson <ryan.goat@gmail.com> wrote:
> My file server is usually very stable. The past week I had two mdadm
> arrays that required resync operations.
> * newly created raid6 array (14 x 16TB seagate exos)
> * existing raid 6 array, after a reboot resync on hot spare (14 x 4TB
> seagate barracuda)
>
> During both resync operations (they ran one at a time) the system
> would routinely experience a major error and require a hard reboot,
> every two or three hours. I saw several errors, such as:
> * kernel watchdog soft lockups [md127_raid6:364]
> * general protection faults (I have a few saved with the full exception stack)
> * exceptions in iommu routines (again I have the full error with
> exception stack saved)
> * full system lockup
So in other words the server is very stable, unless asked to do full-speed
reads from all disks at the same time.
I'd suggest checking or improving the cooling on the HBA cards, and then
trying a different PSU.
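To tell a hardware problem apart from an md problem, the same kind of load
(reads from all disks at the same time) can be generated without any resync
running. Below is a minimal sketch of that idea, not a tested diagnostic tool.
The function names, the 1 MiB chunk size and the per-disk byte limit are all
assumptions made for illustration. It only reads, never writes, and reading
raw block devices normally requires root.

```python
import threading

CHUNK = 1024 * 1024  # 1 MiB per read call; an arbitrary choice for this sketch


def read_sequentially(path, limit, totals):
    """Read up to `limit` bytes from `path`, storing the byte count in `totals`."""
    done = 0
    # buffering=0 disables Python-level buffering; the kernel page cache still applies.
    with open(path, "rb", buffering=0) as f:
        while done < limit:
            data = f.read(min(CHUNK, limit - done))
            if not data:  # end of file or device reached
                break
            done += len(data)
    totals[path] = done


def read_all_at_once(paths, limit):
    """Start one reader thread per path so that all paths are read concurrently."""
    totals = {}
    threads = [
        threading.Thread(target=read_sequentially, args=(p, limit, totals))
        for p in paths
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return totals
```

It could be called as, for example, read_all_at_once(["/dev/sdX", "/dev/sdY"],
64 * 1024**3), where the device names are placeholders for the actual array
members. If the machine also locks up under this load with no resync in
progress, that points at the HBAs, cooling or PSU rather than at md.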
> I doubt there is a bug in mdadm that caused this behavior. But it was
> very predictable and repeatable while the resync operations were in
> progress.
>
> How can I avoid these errors the next time I have an array in need of a resync?
>
> OS: debian 11 bullseye
> kernel: 5.10.0-8-amd64 #1 SMP Debian 5.10.46-4 (2021-08-03)
> mdadm: v4.1 - 2018-10-01
> sata HBA: 3 x LSI SAS 9201-16i
> _____________
> Ryan Patterson
> May the wings of liberty never lose a feather.
--
With respect,
Roman