* super_total_bytes mismatch with fs_devices total_rw_bytes
@ 2021-01-28 11:03 Andrew Vaughan
2021-01-28 11:15 ` Hugo Mills
0 siblings, 1 reply; 2+ messages in thread
From: Andrew Vaughan @ 2021-01-28 11:03 UTC (permalink / raw)
To: Btrfs BTRFS
[Please cc me on replies. I'm not subscribed to the list.]
Hi
I have a 3 device btrfs raid1 filesystem on my home linux system. The
filesystem is probably over 10 years old at this point, and has had a
number of drives added and removed over its life. It currently
contains over 8 TB of data and is mainly used as write once, read
rarely/never. There are a number of daily/weekly/monthly snapshots
(I'm guessing about 50 total). Btrfs-progs is from Debian Testing and
currently is version 5.10-1.
I recently used btrfs replace to replace a 4 TB drive with a new 8 TB
drive. After that completed, I did a scrub to ensure that the new
drive wasn't reporting any read errors. (The old drive is still
physically installed in the system, but I expect replace should have
wiped the filesystem metadata, so that the kernel doesn't confuse it
with the new drive).
Today I did '# btrfs fi resize 4:max /srv/shared' as preparation for a
balance to make the extra drive space available. (The old drives are
all fairly full. About 130 GB free space on each. I initially tried
btrfs fi resize max /srv/shared as the syntax on the manpage implies
that devid is optional. Since that command errored, I assume it
didn't change the filesystem).
Also today I updated the kernel to
linux-image-5.10.0-2-amd64_5.10.9-1_amd64.deb, the latest kernel
available in Debian Testing.
linux-image-5.10.0-1-amd64_5.10.4-1_amd64.deb was also installed and I
think that was the running kernel during the replace, scrub and resize
operations.
After installing the new kernel I rebooted (# shutdown -r now). I
wasn't really paying attention, but the shutdown seemed to take longer
than normal. At one stage I thought the system had actually hung.
After the reboot the btrfs filesystem failed to mount. Output of
dmesg | grep -i btrfs:
[ 5.650300] Btrfs loaded, crc32c=crc32c-generic
[ 6.182298] BTRFS: device label samba.btrfs devid 4 transid 1281994
/dev/sdd1 scanned by btrfs (173)
[ 6.182887] BTRFS: device label samba.btrfs devid 5 transid 1281994
/dev/sde1 scanned by btrfs (173)
[ 6.183711] BTRFS: device label samba.btrfs devid 8 transid 1281994
/dev/sdb1 scanned by btrfs (173)
[ 16.492471] BTRFS info (device sdd1): disk space caching is enabled
[ 16.547330] BTRFS error (device sdd1): super_total_bytes
22004298366976 mismatch with fs_devices total_rw_bytes 22004298370048
[ 16.547561] BTRFS error (device sdd1): failed to read chunk tree: -22
[ 16.560975] BTRFS error (device sdd1): open_ctree failed
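The two sizes in the error differ only slightly. A minimal Python sketch of
the arithmetic, using the values from the log above. The 4096-byte sector
size is an assumption; the log does not state it.

```python
# Values copied from the dmesg error above.
super_total_bytes = 22004298366976
total_rw_bytes = 22004298370048

# ASSUMPTION: 4096-byte sector size; the log does not state it.
SECTOR_SIZE = 4096

# How far apart are the two totals?
difference = total_rw_bytes - super_total_bytes
print("difference in bytes:", difference)

# Is each total a whole number of sectors?
print("super_total_bytes remainder:", super_total_bytes % SECTOR_SIZE)
print("total_rw_bytes remainder:", total_rw_bytes % SECTOR_SIZE)
```

The difference is 3072 bytes. Under the 4096-byte assumption the superblock
total is sector-aligned and the summed device total is not. That is only a
reading of the numbers, not a diagnosis of the cause.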
That filesystem is used for archival storage and I don't need it in
the short term, so I simply commented it out in /etc/fstab and
rebooted to get a functioning system.
# uname -a
Linux nl40 5.10.0-2-amd64 #1 SMP Debian 5.10.9-1 (2021-01-20) x86_64 GNU/Linux
# mount -t btrfs /dev/sdd1 /mnt/sdd-tmp
mount: /mnt/sdd-tmp: wrong fs type, bad option, bad superblock on
/dev/sdd1, missing codepage or helper program, or other error.
# dmesg | grep -i btrfs
[ 5.799637] Btrfs loaded, crc32c=crc32c-generic
[ 6.428245] BTRFS: device label samba.btrfs devid 8 transid 1281994
/dev/sdb1 scanned by btrfs (172)
[ 6.428804] BTRFS: device label samba.btrfs devid 5 transid 1281994
/dev/sdd1 scanned by btrfs (172)
[ 6.429473] BTRFS: device label samba.btrfs devid 4 transid 1281994
/dev/sde1 scanned by btrfs (172)
[ 2004.140494] BTRFS info (device sde1): disk space caching is enabled
[ 2004.790843] BTRFS error (device sde1): super_total_bytes
22004298366976 mismatch with fs_devices total_rw_bytes 22004298370048
[ 2004.790854] BTRFS error (device sde1): failed to read chunk tree: -22
[ 2004.805043] BTRFS error (device sde1): open_ctree failed
Note that drive identifiers have changed between reboots. I haven't
seen that on this system before.
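To see exactly which device nodes moved, the scan lines from the two boots
can be compared by devid. A small Python sketch follows; the scan lines are
copied from the dmesg output above with the wrapped lines rejoined, and the
helper name devid_to_node is made up for this illustration.

```python
import re

# Scan lines from the first boot (wrapped lines rejoined).
BOOT_1 = """\
BTRFS: device label samba.btrfs devid 4 transid 1281994 /dev/sdd1 scanned by btrfs (173)
BTRFS: device label samba.btrfs devid 5 transid 1281994 /dev/sde1 scanned by btrfs (173)
BTRFS: device label samba.btrfs devid 8 transid 1281994 /dev/sdb1 scanned by btrfs (173)
"""

# Scan lines from the second boot (wrapped lines rejoined).
BOOT_2 = """\
BTRFS: device label samba.btrfs devid 8 transid 1281994 /dev/sdb1 scanned by btrfs (172)
BTRFS: device label samba.btrfs devid 5 transid 1281994 /dev/sdd1 scanned by btrfs (172)
BTRFS: device label samba.btrfs devid 4 transid 1281994 /dev/sde1 scanned by btrfs (172)
"""

SCAN_RE = re.compile(r"devid (\d+) transid \d+ (/dev/\S+) scanned by")


def devid_to_node(log_text):
    """Map each btrfs devid to the device node it was scanned on."""
    return {int(devid): node for devid, node in SCAN_RE.findall(log_text)}


first = devid_to_node(BOOT_1)
second = devid_to_node(BOOT_2)

# Report, per devid, whether the device node changed between boots.
for devid in sorted(first):
    status = "same" if first[devid] == second.get(devid) else "moved"
    print(devid, first[devid], "->", second.get(devid), status)
```

The devids and the transid are identical across both boots; only the
/dev/sdX names for devid 4 and devid 5 swapped.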
Questions
=========
Is btrfs rescue fix-device-size <device> considered the best way to
recover? Should I run that once for each device in the filesystem?
Do you want me to run any other commands to help diagnose the cause
before attempting recovery?
Unless I hear otherwise, I will probably attempt to reboot to Debian
kernel linux-image-5.10.0-1-amd64_5.10.4-1_amd64.deb tomorrow, to see
whether this is a kernel regression. (Probably 16-24 hours after I
send this).
Thanks for your work on linux and btrfs.
Best sincerely
Andrew V
* Re: super_total_bytes mismatch with fs_devices total_rw_bytes
From: Hugo Mills @ 2021-01-28 11:15 UTC (permalink / raw)
To: Andrew Vaughan; +Cc: Btrfs BTRFS
I'm not sure I'm confident enough to recommend a course of action
on this one, but one note from something you said:
On Thu, Jan 28, 2021 at 10:03:08PM +1100, Andrew Vaughan wrote:
[...]
> Today I did '# btrfs fi resize 4:max /srv/shared' as preparation for a
> balance to make the extra drive space available. (The old drives are
> all fairly full. About 130 GB free space on each. I initially tried
> btrfs fi resize max /srv/shared as the syntax on the manpage implies
> that devid is optional. Since that command errored, I assume it
> didn't change the filesystem).
The devid is indeed optional, but it then assumes that you mean
device 1 (which is what it is on a single-device FS). It looks like
your FS, for historical reasons, no longer has a device 1, hence the
error. That should be completely harmless.
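The defaulting behaviour described above can be sketched in Python. This is
an illustration only, not code from btrfs-progs: parse_resize_arg and
PRESENT_DEVIDS are hypothetical names, and the devids are the ones shown in
the dmesg output quoted in this thread.

```python
# Devids reported in the dmesg output for this filesystem.
PRESENT_DEVIDS = {4, 5, 8}


def parse_resize_arg(arg, default_devid=1):
    """Split a '[devid:]size' argument; a missing devid falls back to 1.

    Hypothetical helper for illustration, not the btrfs-progs parser.
    """
    devid_text, sep, size = arg.rpartition(":")
    if not sep:
        # No 'devid:' prefix was given, so the default devid applies.
        return default_devid, arg
    return int(devid_text), size


for arg in ("max", "4:max"):
    devid, size = parse_resize_arg(arg)
    found = devid in PRESENT_DEVIDS
    print(f"{arg!r}: devid={devid} size={size} device present: {found}")
```

With no devid 1 in the filesystem, the bare "max" form has nothing to act
on, while "4:max" names a device that exists.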
[...]
> # uname -a
> Linux nl40 5.10.0-2-amd64 #1 SMP Debian 5.10.9-1 (2021-01-20) x86_64 GNU/Linux
>
> # mount -t btrfs /dev/sdd1 /mnt/sdd-tmp
> mount: /mnt/sdd-tmp: wrong fs type, bad option, bad superblock on
> /dev/sdd1, missing codepage or helper program, or other error.
>
> # dmesg | grep -i btrfs
> [ 5.799637] Btrfs loaded, crc32c=crc32c-generic
> [ 6.428245] BTRFS: device label samba.btrfs devid 8 transid 1281994
> /dev/sdb1 scanned by btrfs (172)
> [ 6.428804] BTRFS: device label samba.btrfs devid 5 transid 1281994
> /dev/sdd1 scanned by btrfs (172)
> [ 6.429473] BTRFS: device label samba.btrfs devid 4 transid 1281994
> /dev/sde1 scanned by btrfs (172)
> [ 2004.140494] BTRFS info (device sde1): disk space caching is enabled
> [ 2004.790843] BTRFS error (device sde1): super_total_bytes
> 22004298366976 mismatch with fs_devices total_rw_bytes 22004298370048
> [ 2004.790854] BTRFS error (device sde1): failed to read chunk tree: -22
> [ 2004.805043] BTRFS error (device sde1): open_ctree failed
>
> Note that drive identifiers have changed between reboots. I haven't
> seen that on this system before.
It happens sometimes. Sometimes it changes between kernels; sometimes
changed hardware responds slightly faster than the previous device. Sometimes
devices get bumped along by having something new attached to an
earlier controller in the enumeration sequence. I've seen machines
that have had totally stable hardware for years suddenly decide to
flip enumeration order on one reboot. I wouldn't worry about it. :)
The good news is I don't see any of the usual horribly fatal error
messages here, so it's probably fixable.
> Questions
> =========
>
> Is btrfs rescue fix-device-size <device> considered the best way to
> recover? Should I run that once for each device in the filesystem?
I'm not confident enough to answer anything more than "probably" to
both of those.
> Do you want me to run any other commands to help diagnose the cause
> before attempting recovery?
Looks like a fairly complete report to me (but see above).
Hugo.
--
Hugo Mills | Be pure.
hugo@... carfax.org.uk | Be vigilant.
http://carfax.org.uk/ | Behave.
PGP: E2AB1DE4 | Torquemada, Nemesis