* One missing device = fs not detected; upgrade things first?
@ 2024-01-29 17:46 Andy Smith
2024-01-29 18:02 ` Andy Smith
2024-01-29 18:24 ` Remi Gauvin
0 siblings, 2 replies; 4+ messages in thread
From: Andy Smith @ 2024-01-29 17:46 UTC (permalink / raw)
To: linux-btrfs
Hi,
I cleanly shut down a machine and powered it off, then upon powering
up two things happened.
Firstly, one of the drives no longer responds or registers with the
OS in any way. As in, there's no device node for it and nothing in
the kernel logs.
Secondly, the btrfs filesystem that is spread across 7 of the
drives (including the missing one) also does not appear. As in,
Linux does not detect a btrfs filesystem on any of the remaining
drives, though the drives themselves appear to be there all fine.
Here's lsblk:
# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sde 8:64 0 931.5G 0 disk
sdf 8:80 0 1.8T 0 disk
sdg 8:96 0 1.8T 0 disk
sdh 8:112 0 931.5G 0 disk
sdi 8:128 0 1.8T 0 disk
sdj 8:144 0 2.7T 0 disk
(I've omitted details of sd{a,b,c,d} as these are system drives and
not involved here.)
The btrfs fs is directly on those drives so it is expected that
lsblk shows no partitions, but it is not expected that it doesn't
show an fs.
Normally the filesystem's drives would be sde through sdk, but as I
say, one appears dead. I am wondering why this has affected my btrfs
filesystem (raid1 profile for data and metadata), though.
I did a "btrfs check" just to see what could be seen. That is still
progressing. It hasn't yet said anything other than a lot of
instances of "failed to load free space cache for block group
…":
# btrfs check -p /dev/sde
Opening filesystem to check...
Checking filesystem on /dev/sde
UUID: 472ee2b3-4dc3-4fc1-80bc-5ba967069ceb
[1/7] checking root items (0:07:57 elapsed, 3055030 items checked)
[2/7] checking extents (0:08:54 elapsed, 1334143 items checked)
failed to load free space cache for block group 15011172843520
failed to load free space cache for block group 15012246585344
[lots more of that]
[3/7] checking free space cache (0:05:04 elapsed, 4220 items checked)
[4/7] checking fs roots (0:51:38 elapsed, 126792 items checked)
[5/7] checking csums (without verifying data) (0:13:45 elapsed, 2123139 items checked)
(still going)
If this completes without saying anything else of interest, should I
dare run it in repair mode? I do have backups.
If I should, then I know I should run this with more up to date
tools, because this is a Debian 10 machine with kernel
4.19.0-26-amd64 and btrfs-progs v4.20.1.
Should I build new btrfs-progs and then a new kernel and just boot
with those to see what happens, and then try the check --repair? Or
should I just build the new kernel and see what that makes of the
devices first, then build new btrfs-tools if I am still to run
check?
I did try mount -odegraded but no btrfs superblock is found, so
there is something up besides the missing drive.
Thanks,
Andy
* Re: One missing device = fs not detected; upgrade things first?
From: Andy Smith @ 2024-01-29 18:02 UTC (permalink / raw)
To: linux-btrfs
On Mon, Jan 29, 2024 at 05:46:17PM +0000, Andy Smith wrote:
> # btrfs check -p /dev/sde
> Opening filesystem to check...
> Checking filesystem on /dev/sde
> UUID: 472ee2b3-4dc3-4fc1-80bc-5ba967069ceb
> [1/7] checking root items (0:07:57 elapsed, 3055030 items checked)
> [2/7] checking extents (0:08:54 elapsed, 1334143 items checked)
> failed to load free space cache for block group 15011172843520
> failed to load free space cache for block group 15012246585344
> [lots more of that]
> [3/7] checking free space cache (0:05:04 elapsed, 4220 items checked)
> [4/7] checking fs roots (0:51:38 elapsed, 126792 items checked)
> [5/7] checking csums (without verifying data) (0:13:45 elapsed, 2123139 items checked)
>
> (still going)
It just finished. Here was the remaining output:
[5/7] checking csums (without verifying data) (0:30:31 elapsed, 5327794 items checked)
[6/7] checking root refs (0:00:00 elapsed, 18 items checked)
[7/7] checking quota groups skipped (not enabled on this FS)
found 4088505802752 bytes used, no error found
total csum bytes: 3983137248
total tree bytes: 5464555520
total fs tree bytes: 526884864
total extent tree bytes: 364138496
btree space waste bytes: 525763765
file data blocks allocated: 4135503101952
referenced 4101458530304
I didn't see anything bad except the mass of "failed to load free
space cache for block group …". Is there anything that is safe to
try without newer kernel and btrfs-progs?
Thanks,
Andy
* Re: One missing device = fs not detected; upgrade things first?
From: Remi Gauvin @ 2024-01-29 18:24 UTC (permalink / raw)
To: Andy Smith, linux-btrfs
On 2024-01-29 12:46 p.m., Andy Smith wrote:
>
> Here's lsblk:
>
> # lsblk
> NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
> sde 8:64 0 931.5G 0 disk
> sdf 8:80 0 1.8T 0 disk
> sdg 8:96 0 1.8T 0 disk
> sdh 8:112 0 931.5G 0 disk
> sdi 8:128 0 1.8T 0 disk
> sdj 8:144 0 2.7T 0 disk
>
>
> The btrfs fs is directly on those drives so it is expected that
> lsblk shows no partitions, but it is not expected that it doesn't
> show an fs.
That is what lsblk output looks like to me. There is no filesystem
information in the output. Are you confusing lsblk and blkid?
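For reference, a minimal sketch of how filesystem signatures could be
listed, since plain lsblk prints none by default. The function names
btrfs_members and list_btrfs_disks are made up for this illustration.
The lsblk and blkid options are standard util-linux ones, but whether
lsblk's FSTYPE column is populated depends on udev having probed the
device, which is an assumption here.

```shell
#!/bin/sh
# Filter: read "NAME FSTYPE" pairs on stdin (the format printed by
# `lsblk -rno NAME,FSTYPE`) and print the device path of every entry
# whose detected filesystem type is btrfs.
btrfs_members() {
    awk '$2 == "btrfs" { print "/dev/" $1 }'
}

# Hypothetical wrapper: list whole disks that carry a btrfs signature.
#   -d  top-level devices only (no partitions)
#   -r  raw, space-separated output
#   -n  no header line
#   -o  choose the columns to print
list_btrfs_disks() {
    lsblk -drno NAME,FSTYPE | btrfs_members
}

# Interactive equivalents:
#   lsblk -f               # adds FSTYPE, LABEL and UUID columns
#   blkid -t TYPE=btrfs    # devices whose signature is btrfs
#   blkid -p /dev/sde      # probe one device directly, bypassing the cache
```

If list_btrfs_disks prints nothing while "blkid -p" on a member device
does report btrfs, that would point at stale or missing udev
information rather than at the filesystem itself.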
> Should I build new btrfs-progs and then a new kernel and just boot
> with those to see what happens, and then try the check --repair? Or
> should I just build the new kernel and see what that makes of the
> devices first, then build new btrfs-tools if I am still to run
> check?
I would suggest the output of btrfs filesystem show, as well as the
exact mount command and any dmesg output when mount command is run.
* Re: One missing device = fs not detected; upgrade things first?
From: Andy Smith @ 2024-01-29 21:41 UTC (permalink / raw)
To: Remi Gauvin; +Cc: linux-btrfs
Hi Remi,
On Mon, Jan 29, 2024 at 01:24:09PM -0500, Remi Gauvin wrote:
> That is what lsblk output looks like to me. There is no filesystem
> information in the output. Are you confusing lsblk and blkid?
Maybe. But see later…
> > Should I build new btrfs-progs and then a new kernel and just boot
> > with those to see what happens, and then try the check --repair? Or
> > should I just build the new kernel and see what that makes of the
> > devices first, then build new btrfs-tools if I am still to run
> > check?
>
> I would suggest the output of btrfs filesystem show, as well as the
> exact mount command and any dmesg output when mount command is run.
Hmm, strange. Now:
# btrfs fi sh
Label: 'tank' uuid: 472ee2b3-4dc3-4fc1-80bc-5ba967069ceb
Total devices 7 FS bytes used 3.72TiB
devid 5 size 2.73TiB used 2.44TiB path /dev/sdj
devid 6 size 1.82TiB used 1.53TiB path /dev/sdf
devid 8 size 931.51GiB used 839.00GiB path /dev/sdh
devid 9 size 931.51GiB used 857.00GiB path /dev/sde
devid 10 size 1.75TiB used 1.67TiB path /dev/sdg
devid 12 size 1.75TiB used 548.50GiB path /dev/sdi
*** Some devices missing
# mount -t btrfs -odegraded /dev/sde /srv/tank
(long pause, but eventually worked)
Logged:
2024-01-29T21:14:01.748805+00:00 specialbrew.localnet kernel: [19014.852866] BTRFS info (device sdj): allowing degraded mounts
2024-01-29T21:14:01.748874+00:00 specialbrew.localnet kernel: [19014.852873] BTRFS info (device sdj): disk space caching is enabled
2024-01-29T21:14:01.768772+00:00 specialbrew.localnet kernel: [19014.866523] BTRFS warning (device sdj): devid 11 uuid 296b4aa0-434d-408e-8b8b-f11d93186a11 is missing
2024-01-29T21:14:03.512856+00:00 specialbrew.localnet kernel: [19016.615549] BTRFS info (device sdj): bdev (null) errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
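
The two outputs above can be cross-checked mechanically. What follows
is only a sketch under stated assumptions: the helper names
absent_device_count and missing_devids are invented here, the first
assumes "btrfs filesystem show" was run for a single filesystem, and
the second assumes the kernel message wording seen in this log.

```shell
#!/bin/sh
# Read `btrfs filesystem show` output for ONE filesystem on stdin and
# print how many member devices are not listed, i.e. the
# "Total devices" figure minus the number of "devid" lines.
absent_device_count() {
    awk '/Total devices/ { total = $3 }
         $1 == "devid"   { seen++ }
         END             { print total - seen }'
}

# Read kernel log text on stdin and print "DEVID UUID" once for each
# device that btrfs reported as missing.
missing_devids() {
    sed -n 's/.*BTRFS warning.*devid \([0-9][0-9]*\) uuid \([0-9a-f-]*\) is missing.*/\1 \2/p' |
        sort -u
}

# Example use on a live system (needs root):
#   btrfs filesystem show /dev/sde | absent_device_count
#   dmesg | missing_devids
```

Fed the text above, the first helper should report one absent device
and the second should name devid 11, which matches the gap in the
devid list (5, 6, 8, 9, 10, 12).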
Admittedly I did not try "btrfs filesystem show" before: nothing
seemed to detect an fs on those drives, and "btrfs dev scan"
seemingly did nothing, which threw me a bit. But I DID try a mount
before, which failed saying it couldn't find a superblock. Yet now
it seems to have worked okay.
Everything seems alright, I just need to work out what happened with
that one drive.
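
For the filesystem side of that, a hedged sketch of the usual two ways
forward once a raid1 filesystem is mounted degraded: replace the
missing device, or remove it and let btrfs re-mirror onto the
remaining drives. The helper plan_recovery is hypothetical and only
prints a command rather than running it, and /dev/sdk stands in for a
replacement drive that does not exist in this thread. Removing the
missing device only works if the remaining drives have enough
unallocated space, which has not been checked here.

```shell
#!/bin/sh
# Print (do not run) the btrfs command that would restore redundancy.
#   $1  devid of the missing device (11 in the log above)
#   $2  mount point of the degraded filesystem
#   $3  optional replacement device; if omitted, plan a removal instead
plan_recovery() {
    devid=$1
    mnt=$2
    newdev=${3:-}
    if [ -n "$newdev" ]; then
        # Rebuild the missing device's contents onto the new one.
        echo "btrfs replace start $devid $newdev $mnt"
    else
        # Drop the missing device and re-mirror its chunks elsewhere.
        echo "btrfs device remove missing $mnt"
    fi
}

# Worth a look before either step (read-only queries, need root):
#   btrfs device stats /srv/tank
#   btrfs filesystem usage /srv/tank
```

For example, "plan_recovery 11 /srv/tank /dev/sdk" prints the replace
command for review, and "plan_recovery 11 /srv/tank" prints the
removal variant. Printing first is a deliberate choice, since both
operations rewrite a lot of data.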
As for lsblk, it now says:
# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sde 8:64 0 931.5G 0 disk
sdf 8:80 0 1.8T 0 disk
sdg 8:96 0 1.8T 0 disk
sdh 8:112 0 931.5G 0 disk
sdi 8:128 0 1.8T 0 disk
sdj 8:144 0 2.7T 0 disk /srv/tank
but I suppose only because that is now mounted.
Thanks!
Andy