* Bug? or normal behavior? if bug, then where? overlay, vfs, xfs, or ????
From: L A Walsh @ 2017-11-01 22:03 UTC
To: linux-unionfs

Trying the overlay fs for the first time and wondering about normal behavior.
After modifying file 'noise' and deleting "file1" in a merged directory, I
get:
> ls -lgG
ls: cannot access 'file1': No such file or directory
total 26858576
?????????? ? ?          ?            file1
-rwxrwxr-x 1 9167721042 Oct  4 09:45 file2*
-rwxrwxr-x 1 9167721042 Oct  5 20:09 file3*
-rwxrwxr-x 1 9167721042 Nov 13  2011 file4*
-rwxrwxr-x 1        496 Oct 28 15:13 noise*
-rwxrwxr-x 1        452 Oct  5 20:08 noise.orig*
drwxrwxr-x 2         18 Oct  4 09:41 src/
I'm a bit concerned about the "white-out" for "file1".
Is this how it is supposed to appear? Should I file
a bug in the kernel's bugzilla-db?
How I got here:
My kernel from "uname -a" is:
Linux Ishtar 4.13.9-Isht-Van #1 SMP Thu Oct 26 16:41:08 PDT 2017 x86_64
GNU/Linux
I used a pre-existing directory "/local/test" as a 'lower':
> /bin/ls -lgGR /local/test
/local/test:
total 35811432
-rwxrwxr-x 1 9167721042 Nov 13  2011 file1
-rwxrwxr-x 1 9167721042 Oct  4 09:45 file2
-rwxrwxr-x 1 9167721042 Oct  5 20:09 file3
-rwxrwxr-x 1 9167721042 Nov 13  2011 file4
-rwxrwxr-x 1        452 Oct  5 20:08 noise
-rwxrwxr-x 1        420 Oct  5 19:57 noise.orig
drwxrwxr-x 2         18 Oct  4 09:41 src
/local/test/src:
total 8952856
-rwxrwxr-x 1 9167721042 Nov 13 2011 file1
in a pre-existing 'xfs' file system:
> xfs_info /local/test
meta-data=/dev/Data/Local    isize=256    agcount=32, agsize=12582896 blks
         =                   sectsz=4096  attr=2
data     =                   bsize=4096   blocks=402652672, imaxpct=10
         =                   sunit=16     swidth=16 blks
naming   =version 2          bsize=4096   ascii-ci=0
log      =internal           bsize=4096   blocks=32768, version=2
         =                   sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none               extsz=4096   blocks=0, rtextents=0
I then created a new xfs file system and mounted it on '/edge';
Ishtar:/edge> xfs_info .
meta-data=/dev/Data/Edge     isize=256    agcount=32, agsize=16777200 blks
         =                   sectsz=4096  attr=2
data     =                   bsize=4096   blocks=536870400, imaxpct=5
         =                   sunit=16     swidth=64 blks
naming   =version 2          bsize=4096   ascii-ci=0
log      =internal           bsize=4096   blocks=262143, version=2
         =                   sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none               extsz=4096   blocks=0, rtextents=0
created directories in it;
> cd /edge
> mkdir merged overlays overlays/upper work
and mounted an overlay fs on "/edge/merged" with:
sudo mount -t overlay none -olowerdir=/local/test,\
upperdir=/edge/overlays/upper,\
workdir=/edge/work /edge/merged
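
[Note: a quick sanity check that the overlay mounted with the intended
layers -- not one of the original steps, and it assumes util-linux's
findmnt is available:]

  # show the overlay mount and its lowerdir/upperdir/workdir options
  findmnt -t overlay /edge/merged
  # or, without findmnt:
  grep /edge/merged /proc/mounts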
After editing 'noise' and removing 'file1', I got the listing
at the top. The 'file1' entry in the top listing can't be deleted.
It is only present/visible in the 'merged' directory, but it
does seem to make overlayfs unusable for general purposes,
so I'm guessing it's a bug?
Of note: a *likely* red herring is this RH bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=1319507
(I say red herring because I'm not using RH, and there is
no real data in the bug other than similar symptoms
running over xfs).
BTW -- is the setup in that bug report even "valid"? I.e., using
the same single underlying file system for all 4 directories?
Thanks!
-linda

* Re: Bug? or normal behavior? if bug, then where? overlay, vfs, xfs, or ????
From: Amir Goldstein @ 2017-11-02  7:03 UTC
To: L A Walsh; +Cc: overlayfs

On Thu, Nov 2, 2017 at 12:03 AM, L A Walsh <lkml@tlinx.org> wrote:
> Trying the overlay fs for the first time and wondering about normal
> behavior.

Linda,

Thanks for the detailed report.
This is overlayfs behavior with certain "old" file systems (see below).

> After modifying file 'noise' and deleting "file1" in a merged
> directory, I get:
>
> > ls -lgG
> ls: cannot access 'file1': No such file or directory
> total 26858576
> ?????????? ? ?          ?            file1
> -rwxrwxr-x 1 9167721042 Oct  4 09:45 file2*
> -rwxrwxr-x 1 9167721042 Oct  5 20:09 file3*
> -rwxrwxr-x 1 9167721042 Nov 13  2011 file4*
> -rwxrwxr-x 1        496 Oct 28 15:13 noise*
> -rwxrwxr-x 1        452 Oct  5 20:08 noise.orig*
> drwxrwxr-x 2         18 Oct  4 09:41 src/
>
> I'm a bit concerned about the "white-out" for "file1".
> Is this how it is supposed to appear? Should I file
> a bug in the kernel's bugzilla-db?

Whiteout certainly shouldn't appear that way.
The reason it does is that your upper fs does not support "d_type"
(see below).
It's a "known" issue, but I don't know where/if it is documented.

I expect that if you look in dmesg, you will see this warning:
"overlayfs: upper fs needs to support d_type."
Somewhat cryptic message - I agree.
For backward compatibility the overlayfs mount does not fail, but it
leaves an overlay mount with the flaw you described.

It's actually very good that you pointed this out, because I now
realize we should hard-enforce d_type support for NFS export.

We also do not check for lower fs d_type support.
That can also expose old whiteouts in certain setups.

...

> I then created a new xfs file system and mounted it on '/edge';
>
> Ishtar:/edge> xfs_info .
> meta-data=/dev/Data/Edge     isize=256    agcount=32, agsize=16777200 blks
>          =                   sectsz=4096  attr=2
> data     =                   bsize=4096   blocks=536870400, imaxpct=5
>          =                   sunit=16     swidth=64 blks
> naming   =version 2          bsize=4096   ascii-ci=0
> log      =internal           bsize=4096   blocks=262143, version=2
>          =                   sectsz=4096  sunit=1 blks, lazy-count=1
> realtime =none               extsz=4096   blocks=0, rtextents=0

Your problem is that you do not have the "ftype" feature in the
directory name format, like this:

naming   =version 2          bsize=4096   ascii-ci=0 ftype=1

Perhaps you have an old version of mkfs.xfs; I'm not sure when ftype=1
became the default format, but you can try

  mkfs.xfs -n ftype=1

and follow the breadcrumbs from there.

...

> BTW -- is the setup in that bug report even "valid"? I.e., using
> the same single underlying file system for all 4 directories?

Yes. Your setup actually uses 2 different file system instances for
lower and upper, which is fine, but it is also perfectly valid, quite
common, and even has some advantages, to use upper/lower on the same
underlying filesystem instance. The location of the 4th directory
(/edge/merged) does not matter.

Amir.
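
[Note: a minimal sketch of how one might confirm the missing-d_type
condition described above -- not part of the original exchange; the
paths are this thread's examples, and it assumes dmesg access and
xfsprogs are available:]

  # the kernel logs this at overlay mount time when the upper fs
  # lacks d_type support:
  dmesg | grep -i 'overlayfs.*d_type'

  # on XFS, d_type support corresponds to the "ftype" directory
  # feature under "naming"; very old xfsprogs (as in the report
  # above) may omit the field entirely:
  xfs_info /edge | grep -o 'ftype=[01]'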

* Re: Bug? or normal behavior? if bug, then where? overlay, vfs, xfs, or ????
From: L A Walsh @ 2017-11-02 21:57 UTC
To: Amir Goldstein; +Cc: overlayfs

Amir Goldstein wrote:
>
> Whiteout certainly shouldn't appear that way.
>
(thank goodness!)

> The reason it does is that your upper fs does not support "d_type"
> (see below).
> It's a "known" issue, but I don't know where/if it is documented.
>
> I expect that if you look in dmesg, you will see this warning:
> "overlayfs: upper fs needs to support d_type."
>
----
Yup... found that.

> We also do not check for lower fs d_type support.
> That can also expose old whiteouts in certain setups.
>
----
*ouch*. I wonder if d_type can be set for existing file systems.
I easily have some file systems that date back more than a few years.

>> I then created a new xfs file system and mounted it on '/edge';
>>
>> Ishtar:/edge> xfs_info ....
>> naming   =version 2          bsize=4096   ascii-ci=0
>
> Your problem is that you do not have the "ftype" feature in the
> directory name format, like this:
>
> naming   =version 2          bsize=4096   ascii-ci=0 ftype=1
>
----
My mkfs.xfs is a few (3) years old.

> Perhaps you have an old version of mkfs.xfs; I'm not sure when
> ftype=1 became the default format, but you can try
>
>   mkfs.xfs -n ftype=1
>
> and follow the breadcrumbs from there.
>
> ...
>
>> BTW -- is the setup in that bug report even "valid"? I.e., using
>> the same single underlying file system for all 4 directories?
>
> Yes. Your setup actually uses 2 different file system instances for
> lower and upper, which is fine, but it is also perfectly valid,
> quite common, and even has some advantages, to use upper/lower on
> the same underlying filesystem instance.
>
----
I was referring to the RH bug report, where they had created
everything on 1 FS. I wondered about upper+lower overlap problems
on the same fs. I'd think that could get a bit tangled.

Thanks... will look for a newer mkfs.xfs.

* Re: Bug? or normal behavior? if bug, then where? overlay, vfs, xfs, or ????
From: Amir Goldstein @ 2017-11-03  6:45 UTC
To: L A Walsh; +Cc: overlayfs

On Thu, Nov 2, 2017 at 11:57 PM, L A Walsh <lkml@tlinx.org> wrote:
> Amir Goldstein wrote:
>> We also do not check for lower fs d_type support.
>> That can also expose old whiteouts in certain setups.
>
> *ouch*. I wonder if d_type can be set for existing file systems.
> I easily have some file systems that date back more than a few years.

Don't think you need to worry about that.
The corner cases where d_type matters on a lower fs are probably not
relevant for your use case.
It only matters if you are using a directory that was once an upper
layer as a lower layer (i.e. stacking of layers) and that layer
already has whiteouts in it.

...

>> Yes. Your setup actually uses 2 different file system instances for
>> lower and upper, which is fine, but it is also perfectly valid,
>> quite common, and even has some advantages, to use upper/lower on
>> the same underlying filesystem instance.
>
> I was referring to the RH bug report, where they had created
> everything on 1 FS.

Doesn't matter to the issue at hand. The bug happens when the upper
fs has no d_type support, regardless of whether the lower fs is the
same or not.

> I wondered about upper+lower overlap problems
> on the same fs. I'd think that could get a bit tangled.

Not sure what exactly you are referring to, but if you are thinking
about using lower/upper directories that overlap - don't!
Overlayfs will not prevent you from shooting your own foot.

Amir.
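
[Note: a side detail that may help with the stacking caveat above:
overlayfs stores whiteouts in the upper layer as character devices
with device number 0/0, so a sketch like the following (paths from
this thread, purely illustrative) can reveal stale whiteouts in a
tree before reusing it as a lower layer:]

  # whiteouts are 0/0 char devices; list any under the old upper layer
  find /edge/overlays/upper -type c | while read -r f; do
      [ "$(stat -c '%t:%T' "$f")" = "0:0" ] && echo "whiteout: $f"
  done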

* Re: Bug? or normal behavior? if bug, then where? overlay, vfs, xfs, or ????
From: L A Walsh @ 2017-11-05  8:17 UTC
To: Amir Goldstein; +Cc: overlayfs

Amir Goldstein wrote:
>
>> I then created a new xfs file system and mounted it on '/edge';
>>
>> Ishtar:/edge> xfs_info .
>> meta-data=/dev/Data/Edge     isize=256    agcount=32, agsize=16777200 blks
>>          =                   sectsz=4096  attr=2
>> data     =                   bsize=4096   blocks=536870400, imaxpct=5
>>          =                   sunit=16     swidth=64 blks
>> naming   =version 2          bsize=4096   ascii-ci=0
>> log      =internal           bsize=4096   blocks=262143, version=2
>>          =                   sectsz=4096  sunit=1 blks, lazy-count=1
>> realtime =none               extsz=4096   blocks=0, rtextents=0
>
> Your problem is that you do not have the "ftype" feature in the
> directory name format, like this:
>
> naming   =version 2          bsize=4096   ascii-ci=0 ftype=1
>
> Perhaps you have an old version of mkfs.xfs; I'm not sure when
> ftype=1 became the default format, but you can try
>
>   mkfs.xfs -n ftype=1
>
----
Ah... no. Last I was told, if you turned on ftype=1, you had to also
pull in crc'ing of all the meta-info. That has problems -- it causes
errors where there would be no problem, and it was never tested on
mature file systems that were already fragmented.

Do you know if it was separated from crc32? For some inexplicable
reason, if you wanted ftype, then the crc option would be forced on
for you. I didn't want it, as I didn't want it to flag errors in
metadata that wasn't crucial, and didn't want the speed slowdown.
Sigh.

The problem with crc'ing the metadata is that there is a LOT more
metadata where detecting errors will do more harm than good (like
what nanosecond the file was last changed, for example). I first ran
into it taking the disk offline when I changed the guid on a newly
formatted disk. That was fixed, but that was a warning shot... How
annoying.

From what you say, though, only the upper layer needs to have
ftype=1. That's a new filesystem, so it shouldn't make that much
difference, but the lower fs's I'd want to use overlays with are
older file systems. But it sounds like those can remain as they are?

(assuming they don't become upper layers in some multi-layer
scenario)...

* Re: Bug? or normal behavior? if bug, then where? overlay, vfs, xfs, or ????
From: Amir Goldstein @ 2017-11-05  8:55 UTC
To: L A Walsh; +Cc: overlayfs, linux-xfs, Dave Chinner, Darrick J. Wong

[adding cc: linux-xfs]

On Sun, Nov 5, 2017 at 10:17 AM, L A Walsh <lkml@tlinx.org> wrote:
> Amir Goldstein wrote:
>> Your problem is that you do not have the "ftype" feature in the
>> directory name format, like this:
>>
>> naming   =version 2          bsize=4096   ascii-ci=0 ftype=1
>>
>> Perhaps you have an old version of mkfs.xfs; I'm not sure when
>> ftype=1 became the default format, but you can try
>>
>>   mkfs.xfs -n ftype=1
>
> Ah... no. Last I was told, if you turned on ftype=1, you had to also
> pull in crc'ing of all the meta-info. That has problems -- it causes
> errors where there would be no problem, and it was never tested on
> mature file systems that were already fragmented.
>
> Do you know if it was separated from crc32? For some inexplicable
> reason, if you wanted ftype, then the crc option would be forced on
> for you.

I don't know if there was a specific reason, but that's the way it is.

> I didn't want it, as I didn't want it to flag errors in metadata
> that wasn't crucial, and didn't want the speed slowdown. Sigh.
>
> The problem with crc'ing the metadata is that there is a LOT more
> metadata where detecting errors will do more harm than good (like
> what nanosecond the file was last changed, for example). I first ran
> into it taking the disk offline when I changed the guid on a newly
> formatted disk. That was fixed, but that was a warning shot... How
> annoying.

I have never heard about those issues that you raise.
It sounds like a myth about XFS metadata CRC that should be debunked,
so I'm forwarding your message on to the XFS list.

See also https://www.spinics.net/lists/xfs/msg19079.html

> From what you say, though, only the upper layer needs to have
> ftype=1. That's a new filesystem, so it shouldn't make that much
> difference, but the lower fs's I'd want to use overlays with are
> older file systems. But it sounds like those can remain as they are?
>
> (assuming they don't become upper layers in some multi-layer
> scenario)...

That is correct.

Amir.

* Re: Bug? or normal behavior? if bug, then where? overlay, vfs, xfs, or ????
From: Dave Chinner @ 2017-11-05 22:34 UTC
To: Amir Goldstein; +Cc: L A Walsh, overlayfs, linux-xfs, Darrick J. Wong

On Sun, Nov 05, 2017 at 10:55:40AM +0200, Amir Goldstein wrote:
> [adding cc: linux-xfs]
>
> On Sun, Nov 5, 2017 at 10:17 AM, L A Walsh <lkml@tlinx.org> wrote:
> > Ah... no. Last I was told, if you turned on ftype=1, you had to
> > also pull in crc'ing of all the meta-info. That has problems -- it
> > causes errors where there would be no problem, and it was never
> > tested on mature file systems that were already fragmented.
> >
> > Do you know if it was separated from crc32? For some inexplicable
> > reason, if you wanted ftype, then the crc option would be forced
> > on for you.

Are you still getting all worked up about how metadata CRCs and
the v5 on-disk format is going to make the sky fall, Linda? It's
time to give in and come join us on the dark side...

> I don't know if there was a specific reason, but that's the way it is.

ftype was implemented as part of the format changes for the v5
format, so it's always enabled for v5 filesystems. It was introduced
as a mkfs option for the v4 format in early 2014, and since mid-2015
it's been the default for non-crc filesystems:

# mkfs.xfs -f -m crc=0 /dev/vdb
.....
naming   =version 2          bsize=4096   ascii-ci=0 ftype=1
.....

Users should try to keep their userspace tools up to date with the
kernel being run.... :)

> > I didn't want it, as I didn't want it to flag errors in metadata
> > that wasn't crucial, and didn't want the speed slowdown. Sigh.
> >
> > The problem with crc'ing the metadata is that there is a LOT more
> > metadata where detecting errors will do more harm than good (like
> > what nanosecond the file was last changed, for example). I first
> > ran into it taking the disk offline when I changed the guid on a
> > newly formatted disk. That was fixed, but that was a warning
> > shot... How annoying.
>
> I have never heard about those issues that you raise.
> It sounds like a myth about XFS metadata CRC that should be
> debunked, so I'm forwarding your message on to the XFS list.

FYI, Amir.

Keep in mind that a lot of people didn't like the concept of
metadata CRCs in XFS because .... reasons. There has been a history
of people jumping on bugs and/or not-yet-implemented features as
justification for their opposition to the change. Call it the nature
of the vocal minority - most users haven't noticed and don't care
that their new install of their distro of choice is now using CRC
enabled filesystems by default....

As to the issue that Linda raised, yes, it *did* exist. We baked
the UUID into the metadata format so we knew what filesystem owns a
specific metadata block. Handy for detecting stale metadata on a
reused device, as well as misdirected writes. We knew about it from
the start (all the tools had to be modified to disallow changing
UUIDs on v5 filesystems!) but it just wasn't an important enough
requirement to have this functionality up front for CRC enabled
filesystems.

However, it wasn't clear what the solution was to the "change UUID"
problem when CRCs were ready, and we also needed to understand the
behaviour of cloned v5 filesystems on COW based snapshots before we
made any sort of change that could require rewriting all the
metadata in the filesystem. So it took some time for the issue to
come to the top of the "remaining problems to solve" list, and when
it did we had already built up enough knowledge about v5 filesystem
behaviour to determine the best way to solve the problem.

IOWs, it was always the plan to support it so that tools like
xfs_copy worked properly with v5 filesystems, but it wasn't a
primary concern compared to making CRCs robust. It was fixed a
couple of years ago:

commit 9c4e12fb60c15dc9c5e54041c9679454b42cb23e
Author: Eric Sandeen <sandeen@sandeen.net>
Date:   Mon Aug 3 10:45:00 2015 +1000

    xfsprogs: Add new sb_meta_uuid field, update userspace tools to
    manipulate it

    This adds a new superblock field, sb_meta_uuid. This allows us to
    change the user-visible UUID on crc-enabled filesystems from
    userspace if desired, by copying the existing UUID to the new
    location for metadata comparisons. If this is done, an incompat
    flag must be set to prevent older filesystems from mounting the
    filesystem, but the original UUID can be restored, and the
    incompat flag removed, with a new xfs_db / xfs_admin UUID
    command, "restore."

    Much of this patch mirrors the kernel patch in simply renaming
    the field used for metadata uuid comparison; other bits:

    * Teach xfs_db to print the new meta_uuid field
    * Allow xfs_db to generate a new UUID for CRC-enabled filesystems
    * Allow xfs_db to revert to the original UUID and clear the flag
    * Fix up xfs_copy to work with CRC-enabled filesystems
    * Update the xfs_admin manpage to show the UUID "restore" command

    Signed-off-by: Eric Sandeen <sandeen@redhat.com>
    Reviewed-by: Brian Foster <bfoster@redhat.com>
    Signed-off-by: Dave Chinner <david@fromorbit.com>

> See also https://www.spinics.net/lists/xfs/msg19079.html

Yeah, that was in reaction to the loud claims that "CRCs are going
to slow everything down". Late last year we significantly reduced
the CPU overhead of CRC calculation on the write side, so it drops
off the CPU profiles in the workloads described in that link above
almost entirely. This was the commit:

commit cae028df53449905c944603df624ac94bc619661
Author: Dave Chinner <dchinner@redhat.com>
Date:   Mon Dec 5 14:40:32 2016 +1100

    xfs: optimise CRC updates

    Nick Piggin reported that the CRC overhead in an fsync heavy
    workload was higher than expected on a Power8 machine. Part of
    this was to do with the fact that the power8 CRC implementation
    is not efficient for CRC lengths of less than 512 bytes, and so
    the way we split the CRCs over the CRC field means a lot of the
    CRCs are reduced to being less than the optimal size.

    To optimise this, change the CRC update mechanism to zero the CRC
    field first, and then compute the CRC in one pass over the buffer
    and write the result back into the buffer. We can do this safely
    because anything writing a CRC has exclusive access to the buffer
    the CRC is being calculated over.

    We leave the CRC verify code the same - it still splits the CRC
    calculation - because we do not want read-only operations
    modifying the underlying buffer. This is because read-only
    operations may not have exclusive access to the buffer
    guaranteed, and so temporary modifications could leak out to
    other processes accessing the buffer concurrently.

    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Dave Chinner <david@fromorbit.com>

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
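
[Note: a short sketch of the workflow the sb_meta_uuid commit enables
-- the device name is a placeholder, and this assumes an xfsprogs new
enough to carry that commit (roughly 4.3 onward); the filesystem must
be unmounted:]

  # set a specific UUID on a v5 filesystem (the original is copied to
  # sb_meta_uuid and an incompat flag is set, per the commit above):
  xfs_admin -U 7073fe11-4b44-4160-a8a0-dec492f61a14 /dev/sdX

  # revert to the original UUID and clear the incompat flag:
  xfs_admin -U restore /dev/sdX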

* Re: Bug? or normal behavior? if bug, then where? overlay, vfs, xfs, or ????
From: L A Walsh @ 2017-11-08 21:21 UTC
To: Dave Chinner; +Cc: Amir Goldstein, overlayfs, linux-xfs, Darrick J. Wong

Dave Chinner wrote:
> Are you still getting all worked up about how metadata CRCs and
> the v5 on-disk format is going to make the sky fall, Linda? It's
> time to give in and come join us on the dark side...
---
I don't believe I've heard that the sky would fall. I only had
2 issues -- 1) that metadata that I didn't care about, or that I
wanted to change, would be crc'd -- preventing changes to metadata I
wanted to change, or flagging errors in metadata I didn't care about
(file last-access time being a nanosecond or a day off due to bit
rot, and crc flagging it as an error).

Maybe you might remember, I first ran into this when, as part of my
mkfs procedure, I assigned my own value to my disk's UUID, and at the
time, the crc-feature claimed the disk had a fault in it.

My second issue was it being tied to the finobt feature in a way that
precluded benchmarking changes on our own filesystems and workload.

>> I don't know if there was a specific reason, but that's the way it is.
>
> ftype was implemented as part of the format changes for the v5
> format, so it's always enabled for v5 filesystems. It was introduced
> as a mkfs option for the v4 format in early 2014, and since mid-2015
> it's been the default for non-crc filesystems:
>
> # mkfs.xfs -f -m crc=0 /dev/vdb
> .....
> naming   =version 2          bsize=4096   ascii-ci=0 ftype=1
> .....
>
> Users should try to keep their userspace tools up to date with the
> kernel being run.... :)
---
And tools writers should remember that those who run some distro may
have tools 2+ years old, and may even have been told that we are
running unsupported configurations if we update system tools (not
that this always stops some people).

I forget -- what switch do I pass to the xfs utils to have them tell
me what features are supported (v4 or v5, for example)? I do see
ftype=0|1 under naming (and that it has nothing to do with crc'ing of
data), as well as crc and finobt under meta-data.

The problem I had was following the kernel docs for the overlayfs and
not seeing where ftype=1 was required when making an xfs file system.
It seems like my mkfs supports ftype, but it isn't the default, and I
didn't know I was supposed to turn it on.

>> I have never heard about those issues that you raise.
>> It sounds like a myth about XFS metadata CRC that should be
>> debunked, so I'm forwarding your message on to the XFS list.
>
> FYI, Amir.
>
> Keep in mind that a lot of people didn't like the concept of
> metadata CRCs in XFS because .... reasons.
---
See above for my reasons.

> As to the issue that Linda raised, yes, it *did* exist. We baked
> the UUID into the metadata format so we knew what filesystem owns a
> specific metadata block. Handy for detecting stale metadata on a
> reused device, as well as misdirected writes. We knew about it from
> the start (all the tools had to be modified to disallow changing
> UUIDs on v5 filesystems!) but it just wasn't an important enough
> requirement to have this functionality up front for CRC enabled
> filesystems.
====
And you have confirmed 1 of my 2 reasons for disliking the crc
feature -- it sounds like you can no longer set the UUID field on a
new file system.

Please don't tell people the sky is falling when you have broken the
ability to change UUIDs as was present in the past. That was a valid
feature -- one that I was told would be excluded from crc'ing, but
now find can't be used without damaging the ability of old systems to
read such file systems.

> Yeah, that was in reaction to the loud claims that "CRCs are going
> to slow everything down". Late last year we significantly reduced
> the CPU overhead of CRC calculation on the write side, so it drops
> off the CPU profiles in the workloads described in that link above
> almost entirely. This was the commit:
---
That article had nothing to do w/my concern and predated my
involvement.

My concern was tying the finobt feature to the crc feature so they
could not be tested in isolation -- to allow seeing what the impact
of crc's might be, but more importantly, seeing if finobt had any
positive impact on more mature file systems without including the crc
feature.

Your stance seems to be that the crc feature combined with the finobt
feature doesn't show a measurable slowdown on newly created file
systems. I would expect that, especially since finobt would benefit
more mature file systems more than newer ones, while on newer file
systems, finobt+crc comes out to about the same performance.

My issue was the inability to bench or use them separately. No sky
falling, just standard benchmark methodology to test changes on your
own workload.

But as to the ftype flag -- that was me using v4 tools and seeing no
information that I needed to explicitly specify it to make the
overlay file system work with xfs, which I don't think has anything
to do with crc's. Right?

-linda

* Re: Bug? or normal behavior? if bug, then where? overlay, vfs, xfs, or ????
From: Dave Chinner @ 2017-11-09  1:47 UTC
To: L A Walsh; +Cc: Amir Goldstein, overlayfs, linux-xfs, Darrick J. Wong

On Wed, Nov 08, 2017 at 01:21:18PM -0800, L A Walsh wrote:
> Dave Chinner wrote:
> > Are you still getting all worked up about how metadata CRCs and
> > the v5 on-disk format is going to make the sky fall, Linda? It's
> > time to give in and come join us on the dark side...
>
> I don't believe I've heard that the sky would fall. I only had
> 2 issues -- 1) that metadata that I didn't care about, or that I
> wanted to change, would be crc'd -- preventing changes to metadata
> I wanted to change, or flagging errors in metadata I didn't care
> about (file last-access time being a nanosecond or a day off due to
> bit rot, and crc flagging it as an error).
>
> Maybe you might remember, I first ran into this when, as part of my
> mkfs procedure, I assigned my own value to my disk's UUID, and at
> the time, the crc-feature claimed the disk had a fault in it.

Yes, but changing the UUID was documented as "not currently
supported" on v5 filesystems *when it was originally released*.
IOWs, it was documented as "will be supported in future", but it
wasn't a critical feature for the initial release of CRC enabled
filesystems. If someone manually changed the UUID (which was the
only way to do it, because the xfs_db commands would refuse to do
it) then *it broke the filesystem*, and so it was correct behaviour
to report corruption.

Changing the UUID on v5 filesystems is now implemented and
supported:

$ sudo mkfs.xfs -f /dev/pmem0
Default configuration sourced from package build definitions
meta-data=/dev/pmem0         isize=512    agcount=4, agsize=524288 blks
         =                   sectsz=4096  attr=2, projid32bit=1
         =                   crc=1        finobt=1, sparse=0, rmapbt=0, reflink=0
data     =                   bsize=4096   blocks=2097152, imaxpct=25, thinblocks=0
         =                   sunit=0      swidth=0 blks
naming   =version 2          bsize=4096   ascii-ci=0 ftype=1
log      =internal log       bsize=4096   blocks=2560, version=2
         =                   sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none               extsz=4096   blocks=0, rtextents=0
$
$ sudo blkid -c /dev/null /dev/pmem0
/dev/pmem0: UUID="7073fe11-4b44-4160-a8a0-dec492f61a14" TYPE="xfs"
$
$ sudo xfs_admin -U generate /dev/pmem0
Clearing log and setting UUID
writing all SBs
new UUID = c3a4f999-b76a-4597-bb62-df11c5e3fc04
$
$ sudo blkid -c /dev/null /dev/pmem0
/dev/pmem0: UUID="c3a4f999-b76a-4597-bb62-df11c5e3fc04" TYPE="xfs"
$

IOWs, this problem is ancient history. Move on, nothing to see here.

> My second issue was it being tied to the finobt feature in a way
> that precluded benchmarking changes on our own filesystems and
> workload.
[....]
> I would expect that, especially since finobt would benefit more
> mature file systems more than newer ones, while on newer file
> systems, finobt+crc comes out to about the same performance.
>
> My issue was the inability to bench or use them separately.

<sigh>

Not an XFS problem:

$ mkfs.xfs -f -m finobt=0 /dev/pmem0
....
         =                   crc=1        finobt=0, sparse=0, rmapbt=0, reflink=0
.....

Yup, crc's enabled, finobt is not. As documented in the mkfs.xfs
man page.

IOWs, we can directly measure the impact of the finobt on
workloads/benchmarks. And if we want to compare the impact of CRCs,
then 'mkfs.xfs -f -i size=512 -m crc=0 <dev>' will be directly
comparable to the above non-finobt filesystem. This is how we
benchmarked the changes in the first place....

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
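
[Note: gathering Dave's point into one place -- a sketch of the three
mkfs invocations that isolate the two features; the device name is a
placeholder, and the comma-separated -m suboptions assume a
reasonably recent mkfs.xfs:]

  # v4 format, no CRCs; isize matched to the v5 default for a fair test
  mkfs.xfs -f -i size=512 -m crc=0 /dev/sdX

  # v5, CRCs on, finobt off -- vs. the line above, isolates the CRC cost
  mkfs.xfs -f -m crc=1,finobt=0 /dev/sdX

  # v5, CRCs on, finobt on -- vs. the line above, isolates finobt
  mkfs.xfs -f -m crc=1,finobt=1 /dev/sdX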

* Re: Bug? or normal behavior? if bug, then where? overlay, vfs, xfs, or ????
From: L A Walsh @ 2017-11-09  7:51 UTC
To: Dave Chinner; +Cc: Amir Goldstein, overlayfs, linux-xfs, Darrick J. Wong

Dave Chinner wrote:
>
> Changing the UUID on v5 filesystems is now implemented and
> supported:
---
And it won't have a problem if seen by a previous-gen tool -- like
xfsrestore on an older "emergency boot disk"? If not, then I
misunderstood what you wrote earlier.

>> My issue was the inability to bench or use them separately.
>
> <sigh>
>
> Not an XFS problem:
>
> $ mkfs.xfs -f -m finobt=0 /dev/pmem0
> ....
>          =                   crc=1        finobt=0, sparse=0, rmapbt=0, reflink=0
> .....
>
> Yup, crc's enabled, finobt is not. As documented in the mkfs.xfs
> man page.

So you are saying I can set finobt to 0 or 1 with crc=0? Because
testing (sigh) crc=1 and finobt=0|1 isn't the same as testing crc=0
and finobt=0|1.

I'm more interested in testing finobt's effect by itself, and have a
secondary interest in the effect of the crc option, because I would
like to use the finobt option (thus the desire to test it first), but
do not currently want to use the crc option -- thus it being of
secondary interest.

Again, if I can test finobt 0 or 1 with no requirement that I turn on
crc, then I was mistaken in my earlier understanding.

So I still have 2 issues: UUID labels that don't preclude using older
emergency boot disks to restore a file system, and the ability to
test finobt apart from other features.

> IOWs, we can directly measure the impact of the finobt on
> workloads/benchmarks. And if we want to compare the impact of CRCs,
> then 'mkfs.xfs -f -i size=512 -m crc=0 <dev>' will be directly
> comparable to the above non-finobt filesystem. This is how we
> benchmarked the changes in the first place....
---
That methodology is flawed. If the crc option is on while finobt is
tested as 0 or 1, the crc option being on means different disk
caching -- if the crc option pulls in some or all of the finobt info,
then you can't measure the finobt option with the crc option turned
on. Even if you turn the crc feature off, if the disk has crc
features, those also change how information is read in.

Only if you create a disk with no crc info on it, and can then test
finobt=0|1, can you see the relative performance of the finobt cases.
If it is the case that finobt can be toggled on a disk with no crc
data or option in use, then I've misunderstood previously read
constraints (or they've changed).

-l