* add fails: nvme1n1p2 does not have a valid v1.2 superblock, not importing
@ 2025-05-06 10:25 Daniel Buschke
From: Daniel Buschke @ 2025-05-06 10:25 UTC (permalink / raw)
  To: linux-raid

Hi,
Before I start: I am a bit unsure whether this is the correct place for
this issue. If not, please be kind and tell me :)

I created a RAID1 array which was running fine. This array is currently
in "degraded" state because one device failed. The failed device just
vanished and could not be removed from the array before being replaced
with a new device. After rebooting I tried to add the new device with:

# mdadm --manage /dev/md1 --add /dev/nvme1n1p2

which fails with "mdadm: add new device failed for /dev/nvme1n1p2 as 2:
Invalid argument". The corresponding dmesg output is:

md: nvme1n1p2 does not have a valid v1.2 superblock, not importing!
md: md_import_device returned -22
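
For reference, this is roughly what I would run next to double-check the
new partition; I have not captured the output yet, so treat it as a
sketch only: "mdadm --examine" to look for stale md metadata on it, and
"blockdev --getsize64" to confirm the partition is at least as large as
the used dev size of md1.

# mdadm --examine /dev/nvme1n1p2       # stale md metadata? (not run yet)
# blockdev --getsize64 /dev/nvme1n1p2  # size >= used dev size? (not run yet)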

I am a bit confused by this message as I assumed that the raid subsystem
would create the superblock on the new device itself. After searching the
web I found that a lot of people have the same problem and come up with
... interesting ... solutions, which I am currently not willing to try
because I haven't understood the problem yet. So, maybe someone can help
me understand what is going on:

1. What exactly does this error message mean? I thought replacing a
failed drive with a new one is exactly what RAID is for, so this
shouldn't be an issue at all?

2. During my search I got the feeling that the problem is that the
failed drive is somehow still "present" in the raid. Thus the add is
handled as a "re-add", which fails because there is no md superblock on
the new device. Is my conclusion correct?

3. If 2. is correct, how do I remove the failed but not really present
device? Commands like "mdadm ... --remove failed" (the exact invocation
is spelled out below this list) did not help.

4. I have already successfully replaced old devices in this RAID before.
What may have changed so that this issue happens now?
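
For completeness, the exact remove command I mean in question 3, from
memory:

# mdadm --manage /dev/md1 --remove failed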


regards
Daniel


Some technical details which might be useful:


-------------------- 8< --------------------
# mdadm --detail /dev/md1
/dev/md1:
            Version : 1.2
      Creation Time : Tue Apr 29 16:38:51 2025
         Raid Level : raid1
         Array Size : 997973312 (951.74 GiB 1021.92 GB)
      Used Dev Size : 997973312 (951.74 GiB 1021.92 GB)
       Raid Devices : 2
      Total Devices : 1
        Persistence : Superblock is persistent

      Intent Bitmap : Internal

        Update Time : Tue May  6 11:47:38 2025
              State : clean, degraded
     Active Devices : 1
    Working Devices : 1
     Failed Devices : 0
      Spare Devices : 0

Consistency Policy : bitmap

               Name : rescue:1
               UUID : 7b7a8b41:e9cfa3ad:f1224061:1d0e7936
             Events : 28548

     Number   Major   Minor   RaidDevice State
        -       0        0        0      removed
        3     259        3        1      active sync   /dev/nvme0n1p2

-------------------- 8< --------------------
# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 nvme0n1p2[3]
       997973312 blocks super 1.2 [2/1] [_U]
       bitmap: 7/8 pages [28KB], 65536KB chunk

unused devices: <none>

-------------------- 8< --------------------
# uname -a
Linux example.org 6.14.4-1-default #1 SMP PREEMPT_DYNAMIC Fri Apr 25 
09:13:41 UTC 2025 (584fafa) x86_64 x86_64 x86_64 GNU/Linux


* Re: add fails: nvme1n1p2 does not have a valid v1.2 superblock, not importing
From: Roman Mamedov @ 2025-05-08  6:27 UTC (permalink / raw)
  To: Daniel Buschke; +Cc: linux-raid

On Tue, 6 May 2025 12:25:13 +0200
Daniel Buschke <damage@devloop.de> wrote:

> 1. What exactly does this error message mean? I thought replacing a
> failed drive with a new one is exactly what RAID is for, so this
> shouldn't be an issue at all?
> 
> 2. During my search I got the feeling that the problem is that the
> failed drive is somehow still "present" in the raid. Thus the add is
> handled as a "re-add", which fails because there is no md superblock on
> the new device. Is my conclusion correct?
> 
> 3. If 2. is correct, how do I remove the failed but not really present
> device? Commands like "mdadm ... --remove failed" did not help.
> 
> 4. I have already successfully replaced old devices in this RAID before.
> What may have changed so that this issue happens now?

I agree that it is a weird error to get in this situation. "man mdadm" gives
something to try:

       --add-spare
              Add  a  device as a spare.  This is similar to --add except that
              it does not attempt --re-add first.  The device will be added as
              a  spare  even  if it looks like it could be an recent member of
              the array.

Another idea (from the same man page) would be "mdadm ... --fail detached".
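
Untested, but put together I would expect it to look roughly like this
(device name taken from your first mail; --add-spare alone might already
be enough):

# mdadm --manage /dev/md1 --fail detached     # mark the vanished member failed
# mdadm --manage /dev/md1 --remove detached   # then drop it from the array
# mdadm --manage /dev/md1 --add-spare /dev/nvme1n1p2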

-- 
With respect,
Roman
