public inbox for linux-raid@vger.kernel.org
* mdadm resync causes stable system to crash every 2 or 3 hours
@ 2021-09-07  0:44 Ryan Patterson
  2021-09-07  4:59 ` Wols Lists
  2021-09-07  7:52 ` Roman Mamedov
From: Ryan Patterson @ 2021-09-07  0:44 UTC
  To: linux-raid

My file server is usually very stable.  The past week I had two mdadm
arrays that required resync operations:
* a newly created raid6 array (14 x 16TB Seagate Exos)
* an existing raid6 array, resyncing onto a hot spare after a reboot
(14 x 4TB Seagate Barracuda)

During both resync operations (they ran one at a time) the system
would routinely experience a major error and require a hard reboot,
every two or three hours.  I saw several errors, such as:
* kernel watchdog soft lockups [md127_raid6:364]
* general protection faults (I have a few saved with the full exception stack)
* exceptions in iommu routines (again I have the full error with
exception stack saved)
* full system lockup

I doubt there is a bug in mdadm that caused this behavior.  But it was
very predictable and repeatable while the resync operations were in
progress.

How can I avoid these errors the next time I have an array in need of a resync?
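
To correlate each crash with where the resync was at the time, the
progress line in /proc/mdstat could be logged periodically.  The sketch
below is a minimal example, assuming the usual mdstat progress format
("resync = 12.6% (...) finish=...min speed=...K/sec").  The function
name and the sample text are hypothetical and are not taken from this
system:

```python
import re

# Progress line md prints while a sync action runs, e.g.
#   [==>....]  resync = 12.6% (1234/9876) finish=127.5min speed=33440K/sec
PROGRESS_RE = re.compile(
    r"(?P<action>resync|recovery|reshape|check)\s*=\s*(?P<pct>[\d.]+)%"
    r".*?finish=(?P<finish>[\d.]+)min\s+speed=(?P<speed>\d+)K/sec"
)


def parse_mdstat(text):
    """Return {array_name: progress dict} for arrays with a sync action running."""
    progress = {}
    current = None
    for line in text.splitlines():
        # Array header lines look like "md127 : active raid6 ..."
        header = re.match(r"^(md\S+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        found = PROGRESS_RE.search(line)
        if found and current:
            progress[current] = {
                "action": found.group("action"),
                "percent": float(found.group("pct")),
                "finish_min": float(found.group("finish")),
                "speed_kb_s": int(found.group("speed")),
            }
    return progress


# Made-up sample for illustration only; on a live system the text
# would come from open("/proc/mdstat").read().
SAMPLE = """\
Personalities : [raid6] [raid5] [raid4]
md127 : active raid6 sdb[0] sdc[1] sdd[2] sde[3]
      31251490816 blocks super 1.2 level 6, 512k chunk, algorithm 2 [4/4] [UUUU]
      [==>..................]  resync = 12.6% (1969000000/15625745408) finish=1127.5min speed=201840K/sec

unused devices: <none>
"""

if __name__ == "__main__":
    print(parse_mdstat(SAMPLE))
```

Writing this output to a log (or to a remote host) every minute or so
would show whether the crashes cluster at a particular resync position
or speed.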

OS: Debian 11 bullseye
kernel: 5.10.0-8-amd64 #1 SMP Debian 5.10.46-4 (2021-08-03)
mdadm: v4.1 - 2018-10-01
SATA HBA: 3 x LSI SAS 9201-16i
_____________
Ryan Patterson
May the wings of liberty never lose a feather.


Thread overview: 8+ messages
2021-09-07  0:44 mdadm resync causes stable system to crash every 2 or 3 hours Ryan Patterson
2021-09-07  4:59 ` Wols Lists
2021-09-07 22:38   ` Ryan Patterson
2021-09-07  7:52 ` Roman Mamedov
2021-09-07  9:18   ` Roman Mamedov
2021-09-07 17:55     ` Roger Heflin
2021-09-07 22:55     ` Ryan Patterson
2022-01-15 15:46       ` Ryan Patterson
