From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DA60EEE57CC for ; Sun, 10 Sep 2023 02:56:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345374AbjIJC4D (ORCPT ); Sat, 9 Sep 2023 22:56:03 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41894 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233350AbjIJC4C (ORCPT ); Sat, 9 Sep 2023 22:56:02 -0400 Received: from www18.qth.com (www18.qth.com [69.16.238.59]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 634031B5 for ; Sat, 9 Sep 2023 19:55:57 -0700 (PDT) Received: from [73.207.192.158] (port=49014 helo=jpo) by www18.qth.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1qfAcA-00040i-2G for linux-raid@vger.kernel.org; Sat, 09 Sep 2023 21:55:56 -0500 Date: Sun, 10 Sep 2023 02:55:55 +0000 From: David T-G To: Linux RAID list Subject: Re: all of my drives are spares Message-ID: <20230910025554.GD1085@jpo> Reply-To: Linux RAID list References: <20230908025035.GB1085@jpo> <20230909112656.GC1085@jpo> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.10.1 (2018-07-13) X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - www18.qth.com X-AntiAbuse: Original Domain - vger.kernel.org X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12] X-AntiAbuse: Sender Address Domain - justpickone.org X-Get-Message-Sender-Via: www18.qth.com: authenticated_id: dmail@justpickone.org X-Authenticated-Sender: www18.qth.com: dmail@justpickone.org X-Source: X-Source-Args: X-Source-Dir: Precedence: bulk List-ID: X-Mailing-List: linux-raid@vger.kernel.org Wol, et al -- ...and then Wol said... % On 09/09/2023 12:26, David T-G wrote: % > % > Wow ... I'm used to responses pointing out either what I've left % > out or how stupid my setup is, but total silence ... How did I % > offend and how can I fix it? % % Sorry, it's usually me that's the quick response, everyone else takes ages, True! % and I'm feeling a bit burnt out with life in general at the moment. Oh! Sorry to hear that. Not much I can do from here, but I can think you a hug :-) I hope things look up soon. % > % > I sure could use advice on the current hangup before perhaps just % > destroying my entire array with the wrong commands ... % % I wonder if a controlled reboot would fix it. Or just do a --stop followed I've tried a couple of reboots; they're stuck that way. I'll try the stop and assemble. % by an assemble. The big worry is the wildly varying event counts. Do your % arrays have journals. No, I don't think so, unless they're created automagically. Alas, I don't recall the exact creation command :-/ How can I check for usre? The -D flag output doesn't mention a journal, whether enabled or missing. % > % > With fingers crossed, % > :-D % % If the worst comes to the worst, try a forced assemble with the minimum % possible drives (no redundancy). Pick the drives with the highest event % counts. You can then re-add the remaining ones if that works. Hmmmmm ... For a RAID5 array, the minimum would be one left out, right? So five instead of all six. And the event counts seem to be three and three, which is interesting but also doesn't point to any one favorite to drop :-/ % % Iirc this is actually not uncommon and it shouldn't be hard to recover from. % I really ought to go through the archives, find a bunch of occasions, and % write it up. In your copious free time :-) That would, indeed, be awesome. % % The only real worry is that the varying event counts mean that some data % corruption is likely. Recent files, hopefully nothing important. One thing Gaaaaah! Fortunately, everything is manually mirrored out to external drives, so if everything did go tits-up I could reload. I'll come up with a diff using the externals as the sources and check ... eventually. % that's just struck me, this is often caused by a drive failing some while % back, and then a glitch on a second drive brings the whole thing down. When % did you last check your array was fully functional? Let me get back to you on that. It's actually been a couple of weeks in this state just waiting to get to it; life has been interesting here, too. But I have a heartbeat script that might have captured happy data ... These are all pretty new drives after the 4T drive disaster a year (or two?) ago, so they *should* be OK. % % Cheers, % Wol Thanks again & stay tuned :-D -- David T-G See http://justpickone.org/davidtg/email/ See http://justpickone.org/davidtg/tofu.txt