From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Elana Hashman <Elana.Hashman@twosigma.com>
Cc: "'tytso@mit.edu'" <tytso@mit.edu>,
"'linux-ext4@vger.kernel.org'" <linux-ext4@vger.kernel.org>
Subject: Re: Phantom full ext4 root filesystems on 4.1 through 4.14 kernels
Date: Thu, 8 Nov 2018 10:47:22 -0800 [thread overview]
Message-ID: <20181108184722.GB27852@magnolia> (raw)
In-Reply-To: <9abbdde6145a4887a8d32c65974f7832@exmbdft5.ad.twosigma.com>
On Thu, Nov 08, 2018 at 05:59:18PM +0000, Elana Hashman wrote:
> Hi Ted,
>
> We've run into a mysterious "phantom" full filesystem issue on our Kubernetes fleet. We initially encountered this issue on kernel 4.1.35, but are still experiencing the problem after upgrading to 4.14.67. Essentially, `df` reports our root filesystems as full and they behave as though they are full, but the "used" space cannot be accounted for. Rebooting the system, remounting the root filesystem read-only and then remounting as read-write, or booting into single-user mode all free up the "used" space. The disk slowly fills up over time, suggesting that there might be some kind of leak; we previously saw this affecting hosts with ~200 days of uptime on the 4.1 kernel, but are now seeing it affect a 4.14 host with only ~70 days of uptime.
>
> Here is some data from an example host, running the 4.14.67 kernel. The root disk is ext4.
>
> $ uname -a
> Linux <hostname> 4.14.67-ts1 #1 SMP Wed Aug 29 13:28:25 UTC 2018 x86_64 GNU/Linux
> $ grep ' / ' /proc/mounts
> /dev/disk/by-uuid/<some-uuid> / ext4 rw,relatime,errors=remount-ro,data=ordered 0 0
>
> `df` reports 0 bytes free:
>
> $ df -h /
> Filesystem Size Used Avail Use% Mounted on
> /dev/disk/by-uuid/<some-uuid> 50G 48G 0 100% /
This is very odd. I wonder how many of those overlayfses are still
mounted on the system at this point? Over in xfs land we've discovered
that overlayfs subtly changes the lifetime behavior of in-core inodes;
maybe that's what's going on here? (Pure speculation on my part...)
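For what it's worth, a quick way to count them (a sketch; the file
argument exists only so the snippet can be exercised against a saved
copy -- on a live host point it at /proc/mounts):

```shell
# Count overlay-type mounts in a mounts table (default /proc/mounts).
# Note: Docker's "overlay2" storage driver still registers its mounts
# with filesystem type "overlay", so matching that one type is enough.
count_overlays() {
    mounts_file="${1:-/proc/mounts}"
    # Field 3 of each mounts(5)-format line is the filesystem type.
    awk '$3 == "overlay" { n++ } END { print n + 0 }' "$mounts_file"
}
```

e.g. `count_overlays /proc/mounts` on the affected host, ideally
compared against the number of containers Docker thinks are running.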
> Deleted, open files account for almost no disk capacity:
>
> $ sudo lsof -a +L1 /
> COMMAND PID USER FD TYPE DEVICE SIZE/OFF NLINK NODE NAME
> java 5313 user 3r REG 8,3 6806312 0 1315847 /var/lib/sss/mc/passwd (deleted)
> java 5313 user 11u REG 8,3 55185 0 2494654 /tmp/classpath.1668Gp (deleted)
> system_ar 5333 user 3r REG 8,3 6806312 0 1315847 /var/lib/sss/mc/passwd (deleted)
> java 5421 user 3r REG 8,3 6806312 0 1315847 /var/lib/sss/mc/passwd (deleted)
> java 5421 user 11u REG 8,3 149313 0 2494486 /tmp/java.fzTwWp (deleted)
> java 5421 tsdist 12u REG 8,3 55185 0 2500513 /tmp/classpath.7AmxHO (deleted)
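As a sanity check on that claim, the SIZE/OFF column of the lsof output
can be totalled (a sketch; assumes lsof's default column layout, where
SIZE/OFF is the seventh field):

```shell
# Sum the SIZE/OFF column of `lsof -a +L1 /` output read from stdin,
# skipping the header row, and print the total number of bytes.
# Live usage (not run here):  sudo lsof -a +L1 / | sum_deleted_bytes
sum_deleted_bytes() {
    awk 'NR > 1 { total += $7 } END { print total + 0 }'
}
```

Fed the six rows above, this totals roughly 20 MB -- nowhere near
enough to explain a ~32G gap between df and du.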
>
> `du` can only account for 16GB of file usage:
>
> $ sudo du -hxs /
> 16G /
>
> But what is most puzzling is the numbers reported by e2freefrag, which don't add up:
>
> $ sudo e2freefrag /dev/disk/by-uuid/<some-uuid>
> Device: /dev/disk/by-uuid/<some-uuid>
> Blocksize: 4096 bytes
> Total blocks: 13107200
> Free blocks: 7778076 (59.3%)
>
> Min. free extent: 4 KB
> Max. free extent: 8876 KB
> Avg. free extent: 224 KB
> Num. free extent: 6098
>
> HISTOGRAM OF FREE EXTENT SIZES:
> Extent Size Range : Free extents Free Blocks Percent
> 4K... 8K- : 1205 1205 0.02%
> 8K... 16K- : 980 2265 0.03%
> 16K... 32K- : 653 3419 0.04%
> 32K... 64K- : 1337 15374 0.20%
> 64K... 128K- : 631 14151 0.18%
> 128K... 256K- : 224 10205 0.13%
> 256K... 512K- : 261 23818 0.31%
> 512K... 1024K- : 303 56801 0.73%
> 1M... 2M- : 387 135907 1.75%
> 2M... 4M- : 103 64740 0.83%
> 4M... 8M- : 12 15005 0.19%
> 8M... 16M- : 2 4267 0.05%
Clearly a bug in e2freefrag; the percentages are supposed to sum to
100%. Patches soon.
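In the meantime, totalling the quoted histogram with a quick awk sketch
shows just how little of the reported free space the extent scan
actually found:

```shell
# Total the "Free Blocks" and "Percent" columns of an e2freefrag
# histogram read from stdin: the block count is the second-to-last
# field of each row, the percentage is the last (awk's numeric
# coercion ignores the trailing "%"). Over the twelve rows quoted
# above this gives 347157 blocks (~1.3G at 4K) and 4.46% -- against
# the 7778076 free blocks claimed by the summary, so the on-disk
# bitmaps and the free-space counters plainly disagree.
sum_histogram() {
    awk '{ blocks += $(NF-1); pct += $NF }
         END { printf "%d blocks, %.2f%%\n", blocks, pct }'
}
```

i.e. paste the histogram rows into `sum_histogram` on stdin.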
--D
>
> This looks like a bug to me; the histogram in the manpage example has percentages that add up to 100%, but this one doesn't even add up to 5%.
>
> After a reboot, `df` reflects real utilization:
>
> $ df -h /
> Filesystem Size Used Avail Use% Mounted on
> /dev/disk/by-uuid/<some-uuid> 50G 16G 31G 34% /
>
> We are using overlay2fs for Docker, as well as rbd mounts; I'm not sure how they might interact.
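The remount cycle described at the top of this report can be scripted,
though with care (a sketch with a dry-run guard, since remounting /
read-only on a busy host will typically fail with EBUSY or disrupt
services; treat it as illustration, not a fix):

```shell
#!/bin/sh
# Dry-run wrapper: with DRY_RUN=1 (the default here) each command is
# only printed, never executed. Set DRY_RUN=0 to run the real thing.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# The workaround sequence: drop to read-only, come back read-write,
# then check whether df's notion of used space has snapped back.
run mount -o remount,ro /
run mount -o remount,rw /
run df -h /
```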
>
> Thanks for your help,
>
> --
> Elana Hashman
> ehashman@twosigma.com
Thread overview: 21+ messages
2018-11-08 17:59 Phantom full ext4 root filesystems on 4.1 through 4.14 kernels Elana Hashman
2018-11-08 18:13 ` Reindl Harald
2018-11-08 18:20 ` Elana Hashman
2018-11-08 18:47 ` Darrick J. Wong [this message]
2018-12-05 16:26 ` Elana Hashman
2019-01-23 19:59 ` Thomas Walker
2019-06-26 15:17 ` Thomas Walker
2019-07-11 9:23 ` Jan Kara
2019-07-11 14:40 ` Geoffrey Thomas
2019-07-11 15:23 ` Jan Kara
2019-07-11 17:10 ` Theodore Ts'o
2019-07-12 19:19 ` Thomas Walker
2019-07-12 20:28 ` Theodore Ts'o
2019-07-12 21:47 ` Geoffrey Thomas
2019-07-25 21:22 ` Geoffrey Thomas
2019-07-29 10:09 ` Jan Kara
2019-07-29 11:18 ` ext4 file system is constantly writing to the block device with no activity from the applications, is it a bug? Dmitrij Gusev
2019-07-29 12:55 ` Theodore Y. Ts'o
2019-07-29 21:12 ` Dmitrij Gusev
2019-01-24 1:54 ` Phantom full ext4 root filesystems on 4.1 through 4.14 kernels Liu Bo
2019-01-24 14:40 ` Elana Hashman