Linux RAID subsystem development
From: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
To: Joel Parthemore <joel@parthemores.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: request for help on IMSM-metadata RAID-5 array
Date: Mon, 25 Sep 2023 11:44:20 +0200	[thread overview]
Message-ID: <20230925114420.0000302f@linux.intel.com> (raw)
In-Reply-To: <507b6ab0-fd8f-d770-ba82-28def5f53d25@parthemores.com>

On Sat, 23 Sep 2023 12:54:52 +0200
Joel Parthemore <joel@parthemores.com> wrote:

> Apologies in advance for the long email, but I wanted to include 
> everything that is asked for on the "asking for help" page associated 
> with the mailing list. The output from some of the requested commands is 
> pretty lengthy.
> 
> My home directory is on a three-disk RAID-5 array that, for whatever 
> reason (it seemed like a good idea at the time?), I built using the 
> hooks from the UEFI BIOS (or so I understand what I did). That is to 
> say, it's a "real" software-based RAID array in Linux that's built on a 
> "fake" RAID array in the UEFI BIOS. Mostly nothing important is stored 
> on the /home partition, but I forgot to back up a few important things 
> that are (or, at least, were). So I'd like to get the RAID array back if 
> I can, or know if I can't; and I will be extremely grateful to anyone 
> who can tell me one way or the other.
> 
> All was well for some number of years until a few days ago. After I 
> installed the latest KDE updates, the RAID array would lock up entirely 
> when I tried to log in to a new KDE Wayland session. It all came down to 
> one process that refused to die, running startplasma-wayland. Because 
> the process refused to die, the RAID array could not be stopped cleanly 
> and rebooting the computer therefore caused the RAID array to go out of 
> sync. After that, any attempt whatsoever to access the RAID array would 
> cause the RAID array to lock up again.
> 
> The first few times this happened, I was able to start the computer 
> without starting the RAID array, reassemble the RAID array using the 
> command mdadm --assemble --run --force /dev/md126 /dev/sda /dev/sde 
> /dev/sdc and have it working fine -- I could fix any filestore problems 
> with e2fsck, mount /home, log in to my home directory, do pretty much 
> whatever I wanted -- until I tried logging into a new KDE Wayland 
> session again. This happened several times while I was trying to 
> troubleshoot the problem with startplasma-wayland.
> 
> Unfortunately, one time this didn't work. I was still able to start the 
> computer without starting the RAID array, reassemble it and reboot with 
> the RAID array looking seemingly okay (according to mdadm -D) BUT this 
> time, any attempt to access the RAID array or even just stop the array 
> (mdadm --stop /dev/md126, mdadm --stop /dev/md127) once it was started 
> would cause the RAID array to lock up. That means (I think) that I can't 
> create an image of the array contents using dd, which is what -- of 
> course -- I should have done in the first place. (I could assemble the 
> RAID array read-only, but the RAID array is out of sync because it 
> didn't shut down properly.)
> 
> I'm guessing that the contents of the filestore on the RAID array are 
> probably still there. Does anyone have suggestions on getting the RAID 
> array working properly again and accessing them? I have avoided doing 
> anything further myself because, of course, if the contents of the 
> filestore are still there, I don't want to do anything to jeopardize 
> them. You may tell me that I've done too much already. :-)
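The imaging step Joel regrets skipping can be sketched as a small helper (a hedged example only: the device names /dev/sda, /dev/sdc and /dev/sde come from the mail above, and /mnt/backup is an assumed destination with enough free space for full-disk images):

```shell
# image_member DEV DEST: copy one raw member disk to DEST/<name>.img.
# conv=noerror,sync continues past unreadable sectors and zero-pads them,
# so a partially failing disk still yields a full-size image to work on.
image_member() {
    dev="$1"
    dest="$2"
    img="$dest/$(basename "$dev").img"
    dd if="$dev" of="$img" bs=4M conv=noerror,sync status=none
}

# On the real system (as root, with the array stopped), something like:
#   for d in /dev/sda /dev/sdc /dev/sde; do image_member "$d" /mnt/backup; done
```

Any destructive recovery attempt (forced assembly, fsck with repairs) can then be retried against loop devices built from the images instead of the original disks.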

Hi Joel,
sorry for the late response; I was off for a few days. I see that you
were able to recover the data!

I think the metadata manager (mdmon) is down or broken for some reason:
#systemctl status mdmon@md127.service

If you hit the problem again, please try (but do not abuse it; use it only
as a last resort!):
#systemctl restart mdmon@md127.service

We know of a change in systemd that left our userspace metadata manager
unresponsive, because it could not be restarted after switch-root. The issue
is fixed upstream:
https://git.kernel.org/pub/scm/utils/mdadm/mdadm.git/commit/?id=723d1df4946eb40337bf494f9b2549500c1399b2
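The status check and restart above can be wrapped in a small guard that restarts mdmon only when it has actually gone down (a sketch, not an authoritative procedure; the mdmon@md127.service unit name follows the commands above):

```shell
# ensure_mdmon MDDEV: restart mdmon@MDDEV.service only if it is not active.
# Restarting a healthy mdmon is exactly the abuse to avoid, so the running
# case is left alone.
ensure_mdmon() {
    svc="mdmon@${1}.service"
    if systemctl is-active --quiet "$svc"; then
        echo "$svc is running; leaving it alone"
    else
        echo "$svc is down; restarting as a last resort"
        systemctl restart "$svc"
    fi
}

# Usage: ensure_mdmon md127
```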

I didn't read the whole thread, but this issue matches the symptoms for me.
Hopefully, you will find it useful.

Thanks,
Mariusz


Thread overview: 14+ messages
2023-09-23 10:54 request for help on IMSM-metadata RAID-5 array Joel Parthemore
2023-09-23 11:24 ` Roman Mamedov
2023-09-23 15:18   ` Joel Parthemore
2023-09-23 15:35     ` Roman Mamedov
2023-09-23 15:45       ` Joel Parthemore
2023-09-23 18:49       ` Joel Parthemore
2023-09-25  1:43         ` Yu Kuai
2023-09-25 15:57           ` Joel Parthemore
2023-09-26  1:10             ` Yu Kuai
2023-09-29 19:44               ` Joel Parthemore
     [not found]                 ` <a0b8a693-5d9c-d354-5afc-4500b78a983e@huaweicloud.com>
2023-10-05  7:28                   ` Joel Parthemore
2023-09-25  9:44 ` Mariusz Tkaczyk [this message]
2023-09-25 15:52   ` Joel Parthemore
2023-09-25 16:43     ` Mariusz Tkaczyk
