xfs filesystem reports negative usage

* xfs filesystem reports negative usage - reoccurring problem
@ 2019-05-13  1:45 Tim Smith
  2019-05-13 14:09 ` Brian Foster
  2019-05-13 21:19 ` Dave Chinner
  0 siblings, 2 replies; 9+ messages in thread
From: Tim Smith @ 2019-05-13  1:45 UTC (permalink / raw)
  To: linux-xfs

Hey guys,

We've got a bunch of hosts with multiple spinning disks providing file
server duties with xfs.

Some of the filesystems will go into a state where they report
negative used space -  e.g. available is greater than total.

This appears to be purely cosmetic, as we can still write data to (and
read from) the filesystem, but it throws out our reporting data.

We can (temporarily) fix the issue by unmounting and running
`xfs_repair` on the filesystem, but it soon reoccurs.

Does anybody have any ideas as to why this might be happening and how
to prevent it? Can userspace processes effect change to the xfs
superblock?

Example of a 'good' filesystem on the host:

$ sudo df -k /dev/sdac
Filesystem      1K-blocks       Used  Available Use% Mounted on
/dev/sdac      9764349952 7926794452 1837555500  82% /srv/node/sdac

$ sudo strace df -k /dev/sdac |& grep statfs

statfs("/srv/node/sdac", {f_type=0x58465342, f_bsize=4096,
f_blocks=2441087488, f_bfree=459388875, f_bavail=459388875,
f_files=976643648, f_ffree=922112135, f_fsid={16832, 0},
f_namelen=255, f_frsize=4096, f_flags=3104}) = 0

$ sudo xfs_db -r /dev/sdac
[ snip ]
icount = 54621696
ifree = 90183
fdblocks = 459388955

Example of a 'bad' filesystem on the host:

$ sudo df -k /dev/sdad
Filesystem      1K-blocks        Used   Available Use% Mounted on
/dev/sdad      9764349952 -9168705440 18933055392    - /srv/node/sdad

$ sudo strace df -k /dev/sdad |& grep statfs
statfs("/srv/node/sdad", {f_type=0x58465342, f_bsize=4096,
f_blocks=2441087488, f_bfree=4733263848, f_bavail=4733263848,
f_files=976643648, f_ffree=922172221, f_fsid={16848, 0},
f_namelen=255, f_frsize=4096, f_flags=3104}) = 0

$ sudo xfs_db -r /dev/sdad
[ snip ]
icount = 54657600
ifree = 186173
fdblocks = 4733263928

Host environment:
$ uname -a
Linux hostname 4.15.0-47-generic #50~16.04.1-Ubuntu SMP Fri Mar 15
16:06:21 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.5 LTS
Release: 16.04
Codename: xenial

Thank you!
Tim


* Re: xfs filesystem reports negative usage - reoccurring problem
  2019-05-13  1:45 xfs filesystem reports negative usage - reoccurring problem Tim Smith
@ 2019-05-13 14:09 ` Brian Foster
  2019-05-13 15:06   ` Eric Sandeen
  2019-05-20 23:36   ` Tim Smith
  2019-05-13 21:19 ` Dave Chinner
  1 sibling, 2 replies; 9+ messages in thread
From: Brian Foster @ 2019-05-13 14:09 UTC (permalink / raw)
  To: Tim Smith; +Cc: linux-xfs

On Mon, May 13, 2019 at 11:45:26AM +1000, Tim Smith wrote:
> Hey guys,
> 
> We've got a bunch of hosts with multiple spinning disks providing file
> server duties with xfs.
> 
> Some of the filesystems will go into a state where they report
> negative used space -  e.g. available is greater than total.
> 
> This appears to be purely cosmetic, as we can still write data to (and
> read from) the filesystem, but it throws out our reporting data.
> 
> We can (temporarily) fix the issue by unmounting and running
> `xfs_repair` on the filesystem, but it soon reoccurs.
> 
> Does anybody have any ideas as to why this might be happening and how
> to prevent it? Can userspace processes effect change to the xfs
> superblock?
> 

Hmm, I feel like there have been at least a few fixes for similar
symptoms over the past few releases. It might be hard to pinpoint one
unless somebody more familiar with this problem comes across this.

FWIW, something like commit aafe12cee0 ("xfs: don't trip over negative
free space in xfs_reserve_blocks") looks like it could cause this kind
of wonky accounting, but that's just a guess from skimming the patch
log. I have no idea if you'd be affected by this.

> Example of a 'good' filesystem on the host:
> 
> $ sudo df -k /dev/sdac
> Filesystem      1K-blocks       Used  Available Use% Mounted on
> /dev/sdac      9764349952 7926794452 1837555500  82% /srv/node/sdac
> 
> $ sudo strace df -k /dev/sdac |& grep statfs
> 
> statfs("/srv/node/sdac", {f_type=0x58465342, f_bsize=4096,
> f_blocks=2441087488, f_bfree=459388875, f_bavail=459388875,
> f_files=976643648, f_ffree=922112135, f_fsid={16832, 0},
> f_namelen=255, f_frsize=4096, f_flags=3104}) = 0
> 
> $ sudo xfs_db -r /dev/sdac
> [ snip ]
> icount = 54621696
> ifree = 90183
> fdblocks = 459388955
> 
> Example of a 'bad' filesystem on the host:
> 
> $ sudo df -k /dev/sdad
> Filesystem      1K-blocks        Used   Available Use% Mounted on
> /dev/sdad      9764349952 -9168705440 18933055392    - /srv/node/sdad
> 
> $ sudo strace df -k /dev/sdad |& grep statfs
> statfs("/srv/node/sdad", {f_type=0x58465342, f_bsize=4096,
> f_blocks=2441087488, f_bfree=4733263848, f_bavail=4733263848,
> f_files=976643648, f_ffree=922172221, f_fsid={16848, 0},
> f_namelen=255, f_frsize=4096, f_flags=3104}) = 0
> 

It looks like you end up somehow having a huge free block count, larger
even than the total block count. The 'used' value reported by userspace
ends up being f_blocks - f_bfree, hence the negative value.
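
The arithmetic can be reproduced from the statfs numbers quoted above. The following is only a minimal Python sketch: the helper name df_kib is made up for illustration, and it mirrors the f_blocks - f_bfree calculation described here rather than the actual df source.

```python
def df_kib(f_bsize, f_blocks, f_bfree, f_bavail):
    """Return (total, used, available) in 1K blocks, as `df -k` would show them.

    Hypothetical helper: it assumes used = f_blocks - f_bfree, which is the
    relationship described in this message, not a copy of df's code.
    """
    scale = f_bsize // 1024          # 4096-byte blocks -> 4 KiB each
    total = f_blocks * scale
    used = (f_blocks - f_bfree) * scale
    avail = f_bavail * scale
    return total, used, avail

# Values taken from the two statfs() lines quoted in the thread.
good = df_kib(4096, 2441087488, 459388875, 459388875)    # /dev/sdac
bad = df_kib(4096, 2441087488, 4733263848, 4733263848)   # /dev/sdad

print("sdac:", good)
print("sdad:", bad)
```

With f_bfree larger than f_blocks, the "used" term goes negative, which matches the df output for /dev/sdad.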

> $ sudo xfs_db -r /dev/sdad
> [ snip ]
> icount = 54657600
> ifree = 186173
> fdblocks = 4733263928
> 
> Host environment:
> $ uname -a
> Linux hostname 4.15.0-47-generic #50~16.04.1-Ubuntu SMP Fri Mar 15
> 16:06:21 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
> 

Could you also include xfs_info and mount params of the filesystem(s) in
question?

Also, is this negative blocks used state persistent for any of these
filesystems? IOW, if you unmount/mount, are you right back into this
state, or does accounting start off sane and fall into this bogus state
after a period of runtime or due to some unknown operation?

If the former, the next best step might be to try a filesystem on a more
recent kernel and determine whether this problem is already fixed one
way or another. Note that this could be easily done on a
development/test system with an xfs_metadump image of the fs if you
didn't want to muck around with production systems.

Brian

> $ lsb_release -a
> No LSB modules are available.
> Distributor ID: Ubuntu
> Description: Ubuntu 16.04.5 LTS
> Release: 16.04
> Codename: xenial
> 
> Thank you!
> Tim


* Re: xfs filesystem reports negative usage - reoccurring problem
  2019-05-13 14:09 ` Brian Foster
@ 2019-05-13 15:06   ` Eric Sandeen
  2019-05-20 23:39     ` Tim Smith
  2019-05-20 23:36   ` Tim Smith
  1 sibling, 1 reply; 9+ messages in thread
From: Eric Sandeen @ 2019-05-13 15:06 UTC (permalink / raw)
  To: Brian Foster, Tim Smith; +Cc: linux-xfs

On 5/13/19 9:09 AM, Brian Foster wrote:
> We can (temporarily) fix the issue by unmounting and running
> `xfs_repair` on the filesystem, but it soon reoccurs.

I'm kind of interested in what xfs_repair finds in this case.

However, 4.15 is about a year and a half old, so this list may not be
the best place for support.

> $ lsb_release -a
> No LSB modules are available.
> Distributor ID: Ubuntu
> Description: Ubuntu 16.04.5 LTS
> Release: 16.04
> Codename: xenial

LTS is "Long Term Support" right?  So I'd suggest reaching out to your
distribution for assistance unless you can demonstrate the problem
on a current upstream kernel.

-Eric


* Re: xfs filesystem reports negative usage - reoccurring problem
  2019-05-13  1:45 xfs filesystem reports negative usage - reoccurring problem Tim Smith
  2019-05-13 14:09 ` Brian Foster
@ 2019-05-13 21:19 ` Dave Chinner
  2019-05-20 23:41   ` Tim Smith
  1 sibling, 1 reply; 9+ messages in thread
From: Dave Chinner @ 2019-05-13 21:19 UTC (permalink / raw)
  To: Tim Smith; +Cc: linux-xfs

On Mon, May 13, 2019 at 11:45:26AM +1000, Tim Smith wrote:
> Hey guys,
> 
> We've got a bunch of hosts with multiple spinning disks providing file
> server duties with xfs.
> 
> Some of the filesystems will go into a state where they report
> negative used space -  e.g. available is greater than total.
> 
> This appears to be purely cosmetic, as we can still write data to (and
> read from) the filesystem, but it throws out our reporting data.
> 
> We can (temporarily) fix the issue by unmounting and running
> `xfs_repair` on the filesystem, but it soon reoccurs.
....
> Example of a 'good' filesystem on the host:
.....
> fdblocks = 459388955
> 
> Example of a 'bad' filesystem on the host:
.....
> fdblocks = 4733263928


   decimal       hex
 459388955       1b61b81b
4733263928      11a1fe038
                ^
                Single bit is wrong in the free block count.

IOWs, I'd say there's single bit errors happening somewhere in your
system. Whether it be memory corruption, machines being rowhammered,
uncorrected storage media errors, etc I have no idea. But it seems
suspicious that the free block count is almost exactly 0x100000000
out....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: xfs filesystem reports negative usage - reoccurring problem
  2019-05-13 14:09 ` Brian Foster
  2019-05-13 15:06   ` Eric Sandeen
@ 2019-05-20 23:36   ` Tim Smith
  1 sibling, 0 replies; 9+ messages in thread
From: Tim Smith @ 2019-05-20 23:36 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Tue, May 14, 2019 at 12:09 AM Brian Foster <bfoster@redhat.com> wrote:
> Could you also include xfs_info and mount params of the filesystem(s) in
> question?

$ sudo mount | grep sdad
/dev/sdad on /srv/node/sdad type xfs
(rw,noatime,nodiratime,attr2,inode64,logbufs=8,noquota)

$ sudo xfs_info /srv/node/sdad
meta-data=/dev/sdad              isize=512    agcount=10, agsize=268435455 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1 spinodes=0
data     =                       bsize=4096   blocks=2441609216, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal               bsize=4096   blocks=521728, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

$ sudo xfs_info /srv/node/sdac
meta-data=/dev/sdac              isize=512    agcount=10, agsize=268435455 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1 spinodes=0
data     =                       bsize=4096   blocks=2441609216, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal               bsize=4096   blocks=521728, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

> Also, is this negative blocks used state persistent for any of these
> filesystems? IOW, if you unmount/mount, are you right back into this
> state, or does accounting start off sane and fall into this bogus state
> after a period of runtime or due to some unknown operation?

It's persistent. After umount/remount, it's still in the same state.
It seems to happen after some time...


* Re: xfs filesystem reports negative usage - reoccurring problem
  2019-05-13 15:06   ` Eric Sandeen
@ 2019-05-20 23:39     ` Tim Smith
  2019-05-21  1:43       ` Dave Chinner
  0 siblings, 1 reply; 9+ messages in thread
From: Tim Smith @ 2019-05-20 23:39 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Brian Foster, linux-xfs

On Tue, May 14, 2019 at 1:06 AM Eric Sandeen <sandeen@sandeen.net> wrote:
> I'm kind of interested in what xfs_repair finds in this case.

$ sudo xfs_repair -m 4096 -v /dev/sdad
Phase 1 - find and verify superblock...
        - block cache size set to 342176 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 159752 tail block 159752
        - scan filesystem freespace and inode maps...
sb_fdblocks 4725279343, counted 430312047
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
Phase 5 - rebuild AG headers and trees...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...

        XFS_REPAIR Summary    Mon May 20 18:53:30 2019

Phase           Start           End             Duration
Phase 1:        05/20 10:49:27  05/20 10:49:27
Phase 2:        05/20 10:49:27  05/20 10:50:05  38 seconds
Phase 3:        05/20 10:50:05  05/20 15:24:34  4 hours, 34 minutes, 29 seconds
Phase 4:        05/20 15:24:34  05/20 17:08:23  1 hour, 43 minutes, 49 seconds
Phase 5:        05/20 17:08:23  05/20 17:08:25  2 seconds
Phase 6:        05/20 17:08:25  05/20 18:53:30  1 hour, 45 minutes, 5 seconds
Phase 7:        05/20 18:53:30  05/20 18:53:30

Total run time: 8 hours, 4 minutes, 3 seconds

done

> However, 4.15 is about a year and a half old, so this list may not be
> the best place for support.
> ...
> LTS is "Long Term Support" right?  So I'd suggest reaching out to your
> distribution for assistance unless you can demonstrate the problem
> on a current upstream kernel.

Good point. I appreciate your assistance nonetheless :)


* Re: xfs filesystem reports negative usage - reoccurring problem
  2019-05-13 21:19 ` Dave Chinner
@ 2019-05-20 23:41   ` Tim Smith
  0 siblings, 0 replies; 9+ messages in thread
From: Tim Smith @ 2019-05-20 23:41 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On Tue, May 14, 2019 at 7:19 AM Dave Chinner <david@fromorbit.com> wrote:
>   decimal        hex
>  459388955       1b61b81b
> 4733263928      11a1fe038
>                 ^
>                 Single bit is wrong in the free block count.
>
> IOWs, I'd say there's single bit errors happening somewhere in your
> system. Whether it be memory corruption, machines being rowhammered,
> uncorrected storage media errors, etc I have no idea. But it seems
> suspicious that the free block count is almost exactly 0x100000000
> out....

This issue is happening on more than one host, so I'm guessing that it
might not be a RAM issue... however, all the incorrect fdblocks values
do start with a 1:

server2 /dev/sdm
fdblocks = 4674169069 (0x1169A28ED)
server3 /dev/sdad
fdblocks = 4722598181 (0x1197D2125)
server4 /dev/sdad
fdblocks = 4708207408 (0x118A18B30)
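
To make the "starts with a 1" observation concrete, here is a small Python sketch that tests bit 32 in each of the values above and shows what the count would be with that bit cleared. The names BIT32, reported and clear_bit32 are made up for illustration, and treating the cleared value as the "real" count is only a guess, not something verified against the AG headers.

```python
BIT32 = 1 << 32  # 0x100000000

# fdblocks values reported for the three hosts listed above.
reported = {
    "server2 /dev/sdm": 4674169069,
    "server3 /dev/sdad": 4722598181,
    "server4 /dev/sdad": 4708207408,
}

def clear_bit32(value):
    """Return value with bit 32 cleared (a guess at the uncorrupted count)."""
    return value & ~BIT32

for name, fdblocks in reported.items():
    has_bit32 = bool(fdblocks & BIT32)
    print(f"{name}: bit32_set={has_bit32} without_bit32={clear_bit32(fdblocks)}")
```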


* Re: xfs filesystem reports negative usage - reoccurring problem
  2019-05-20 23:39     ` Tim Smith
@ 2019-05-21  1:43       ` Dave Chinner
  2019-05-21  2:10         ` Tim Smith
  0 siblings, 1 reply; 9+ messages in thread
From: Dave Chinner @ 2019-05-21  1:43 UTC (permalink / raw)
  To: Tim Smith; +Cc: Eric Sandeen, Brian Foster, linux-xfs

On Tue, May 21, 2019 at 09:39:02AM +1000, Tim Smith wrote:
> On Tue, May 14, 2019 at 1:06 AM Eric Sandeen <sandeen@sandeen.net> wrote:
> > I'm kind of interested in what xfs_repair finds in this case.
> 
> $ sudo xfs_repair -m 4096 -v /dev/sdad
> Phase 1 - find and verify superblock...
>         - block cache size set to 342176 entries
> Phase 2 - using internal log
>         - zero log...
> zero_log: head block 159752 tail block 159752
>         - scan filesystem freespace and inode maps...
> sb_fdblocks 4725279343, counted 430312047

$ printf %x 4725279343
119a60a6f
$ printf %x 430312047
 19a60a6f

You definitely have uncorrected single bit errors occurring
on your systems.
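
The same comparison can be done by XOR-ing the two numbers, which leaves only the bits that differ. This is a minimal Python sketch; differing_bits is a hypothetical helper written for this illustration and is not part of any XFS tool.

```python
def differing_bits(a, b):
    """Return the bit positions (0 = least significant) where a and b differ."""
    diff = a ^ b
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]

sb_fdblocks = 4725279343  # value xfs_repair read from the superblock
counted = 430312047       # value xfs_repair counted itself

print(hex(sb_fdblocks ^ counted))            # XOR of the two counts
print(differing_bits(sb_fdblocks, counted))  # positions of the differing bits
```

A single entry in the list means exactly one bit separates the stored count from the counted one.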

If the filesystem was writing this bad fdblock count to disk, then
xfs_validate_sb_write() would be firing this warning:

	xfs_warn(mp, "SB summary counter sanity check failed");

when the superblock is written back on unmount. That write would
then fail, and that would leave the log dirty. Then after log
recovery we'd rebuild the counters from the AGFs because it wasn't a
clean unmount, and the problem would go away. If the log was clean,
then we'd see that the fdblocks count was invalid, and we'd rebuild
the counters from the AGFs and the problem would go away.

But you are saying that unmount/mount doesn't fix it, which means
you must be running a sufficiently old kernel that it doesn't detect
these conditions, issue warnings and automatically repair itself.
Yup:

8756a5af1819 ("libxfs: add more bounds checking to sb sanity checks")
2e9e6481e2a7 ("xfs: detect and fix bad summary counts at mount")

were both merged in 4.19. Well, that would explain why you aren't
seeing warnings or having it fixed automatically on detection.

IOWs, whatever the cause of your single bit error is, I don't know,
but it would seem that recent kernels will detect the condition and
automatically fix themselves at mount time.
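
As a rough illustration of why such a check can catch this, the sketch below tests the one property that is obviously violated here: the free block count cannot exceed the number of data blocks. This is a simplified Python stand-in under that assumption, not the kernel's code; the function name summary_counters_sane is made up, and the inode counters come from the earlier xfs_db output, so they may not be from the same moment as the xfs_repair run.

```python
def summary_counters_sane(dblocks, fdblocks, icount, ifree):
    """Simplified plausibility check: free counts must not exceed totals.

    Illustrative only; the real superblock checks in the kernel differ.
    """
    return fdblocks <= dblocks and ifree <= icount

DBLOCKS = 2441609216  # data "blocks=" from the xfs_info output for /dev/sdad
ICOUNT = 54657600     # icount from the earlier xfs_db output
IFREE = 186173        # ifree from the earlier xfs_db output

# fdblocks as stored in the superblock vs. as counted by xfs_repair.
print(summary_counters_sane(DBLOCKS, 4725279343, ICOUNT, IFREE))
print(summary_counters_sane(DBLOCKS, 430312047, ICOUNT, IFREE))
```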

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: xfs filesystem reports negative usage - reoccurring problem
  2019-05-21  1:43       ` Dave Chinner
@ 2019-05-21  2:10         ` Tim Smith
  0 siblings, 0 replies; 9+ messages in thread
From: Tim Smith @ 2019-05-21  2:10 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Eric Sandeen, Brian Foster, linux-xfs

On Tue, May 21, 2019 at 11:43 AM Dave Chinner <david@fromorbit.com> wrote:
> 8756a5af1819 ("libxfs: add more bounds checking to sb sanity checks")
> 2e9e6481e2a7 ("xfs: detect and fix bad summary counts at mount")
>
> were both merged in 4.19. Well, that would explain why you aren't
> seeing warnings or having it fixed automatically on detection.
>
> IOWs, whatever the cause of your single bit error is, I don't know,
> but it would seem that recent kernels will detect the condition and
> automatically fix themselves at mount time.

This kit is in a legacy environment to be (eventually) decommissioned,
so I'll patch the kernel to work around the issue until we can put the
hardware to bed.

Absolute legend! Thank you for all your help!
