* xfsdump does not support reflink copied files properly
@ 2023-11-02 12:42 Alexander Puchmayr
  2023-11-02 16:39 ` Darrick J. Wong
  0 siblings, 1 reply; 3+ messages in thread
From: Alexander Puchmayr @ 2023-11-02 12:42 UTC (permalink / raw)
  To: linux-xfs

Hi there,

I just encountered a problem when trying to use xfsdump on a filesystem with
lots of reflink-copied VM disk images: it yielded a dump file much larger than
expected, which I was then also unable to restore from (target disk full). I
created a Gentoo bug at https://bugs.gentoo.org/916704 and was advised to
report it here as well.

Copy from the bug report:

sys-fs/xfsdump-3.1.12 seems to copy reflink-copied files as ordinary files,
resulting in a dump file that is far too big. Restoring from such a dump
likely yields an out-of-disk-space condition.

It may be used as a denial-of-service tool by any ordinary user who is able
to create lots of reflink copies of large files, breaking a backup system
that relies on xfsdump (one can easily create petabytes or even exabytes of
useless data, overloading every affordable backup system) and also making it
hard to restore the data during a disaster recovery.

How to reproduce:

1) Create a 1GB test filesystem for demonstration purposes and mount it, e.g.
$ dd if=/dev/zero of=/var/tmp/testimage_xfs.raw bs=1M count=1k
$ mkfs.xfs /var/tmp/testimage_xfs.raw
$ mount /var/tmp/testimage_xfs.raw /mnt/tmp

2) Create a 256M test file
$ dd if=/dev/urandom of=/mnt/tmp/testfile.dat bs=1k count=256k

3) Reflink-copy this file multiple times
$ cd /mnt/tmp
$ cp --reflink testfile.dat testfile1.dat
$ cp --reflink testfile.dat testfile2.dat
$ cp --reflink testfile.dat testfile3.dat
$ cp --reflink testfile.dat testfile4.dat
$ cp --reflink testfile.dat testfile5.dat
$ cp --reflink testfile.dat testfile6.dat

4) Verify:
$ df -h .
Filesystem      Size  Used Avail Use% Mounted on
/dev/loop0      960M  295M  666M  31% /mnt/tmp

$ ls -l
total 1835008
-rw-r--r-- 1 root root 268435456 Nov  2 09:37 testfile.dat
-rw-r--r-- 1 root root 268435456 Nov  2 09:38 testfile1.dat
-rw-r--r-- 1 root root 268435456 Nov  2 09:38 testfile2.dat
-rw-r--r-- 1 root root 268435456 Nov  2 09:38 testfile3.dat
-rw-r--r-- 1 root root 268435456 Nov  2 09:38 testfile4.dat
-rw-r--r-- 1 root root 268435456 Nov  2 09:38 testfile5.dat
-rw-r--r-- 1 root root 268435456 Nov  2 09:38 testfile6.dat

5) Create a dump:
$ xfsdump -p10 -L TEST -M TEST1 -l 0 -f /var/tmp/xfsdump /mnt/tmp
xfsdump: using file dump (drive_simple) strategy
xfsdump: version 3.1.12 (dump format 3.0) - type ^C for status and control
xfsdump: WARNING: most recent level 0 dump was interrupted, but not resuming
that dump since resume (-R) option not specified
xfsdump: level 0 dump of poseidon.local:/mnt/tmp
xfsdump: dump date: Thu Nov  2 09:40:46 2023
xfsdump: session id: a5b46d21-698d-47fb-a0a0-9974421f923c
xfsdump: session label: "TEST"
xfsdump: ino map phase 1: constructing initial dump list
xfsdump: ino map phase 2: skipping (no pruning necessary)
xfsdump: ino map phase 3: skipping (only one dump stream)
xfsdump: ino map construction complete
xfsdump: estimated dump size: 1879071232 bytes
xfsdump: creating dump session media file 0 (media 0, file 0)
xfsdump: dumping ino map
xfsdump: dumping directories
xfsdump: dumping non-directory files
xfsdump: ending media file
xfsdump: media file size 1879556704 bytes
xfsdump: dump size (non-dir files) : 1879501192 bytes
xfsdump: dump complete: 5 seconds elapsed
xfsdump: Dump Summary:
xfsdump:   stream 0 /var/tmp/xfsdump OK (success)
xfsdump: Dump Status: SUCCESS

Compare: a filesystem of size 1GB produced a dump of almost 2GB. My
expectation was about the 295M used plus a little overhead.
6) Try to restore:
$ rm /mnt/tmp/*
$ xfsrestore -f /var/tmp/xfsdump /mnt/tmp/
xfsrestore: using file dump (drive_simple) strategy
xfsrestore: version 3.1.12 (dump format 3.0) - type ^C for status and control
xfsrestore: searching media for dump
xfsrestore: examining media file 0
xfsrestore: dump description:
xfsrestore: hostname: poseidon.local
xfsrestore: mount point: /mnt/tmp
xfsrestore: volume: /dev/loop0
xfsrestore: session time: Thu Nov  2 09:40:46 2023
xfsrestore: level: 0
xfsrestore: session label: "TEST"
xfsrestore: media label: "TEST1"
xfsrestore: file system id: 919b4b3c-1d02-4f39-a42d-3fa1b01c1b2f
xfsrestore: session id: a5b46d21-698d-47fb-a0a0-9974421f923c
xfsrestore: media id: bc0946fb-e8aa-41bf-b856-03b6100a5e76
xfsrestore: using online session inventory
xfsrestore: searching media for directory dump
xfsrestore: reading directories
xfsrestore: 1 directories and 7 entries processed
xfsrestore: directory post-processing
xfsrestore: restoring non-directory files
xfsrestore: attempt to write 262144 bytes to testfile3.dat at offset 160161792 failed: No space left on device
xfsrestore: attempt to write 249856 bytes to testfile3.dat at offset 167772160 failed: No space left on device
xfsrestore: attempt to write 245760 bytes to testfile3.dat at offset 184549376 failed: No space left on device
xfsrestore: attempt to write 241664 bytes to testfile3.dat at offset 201326592 failed: No space left on device
xfsrestore: attempt to write 237568 bytes to testfile3.dat at offset 218103808 failed: No space left on device
xfsrestore: attempt to write 233472 bytes to testfile3.dat at offset 234881024 failed: No space left on device
xfsrestore: attempt to write 229376 bytes to testfile3.dat at offset 251658240 failed: No space left on device
xfsrestore: attempt to write 262144 bytes to testfile3.dat at offset 266563584 failed: No space left on device
xfsrestore: WARNING: open of testfile4.dat failed: No space left on device: discarding ino 135
xfsrestore: WARNING: open of testfile5.dat failed: No space left on device: discarding ino 136
xfsrestore: WARNING: open of testfile6.dat failed: No space left on device: discarding ino 137
xfsrestore: restore complete: 3 seconds elapsed
xfsrestore: Restore Summary:
xfsrestore:   stream 0 /var/tmp/xfsdump OK (success)
xfsrestore: Restore Status: SUCCESS
$

RESTORE FAILED!! The files were restored as ordinary files, not as reflinked
files!
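The size mismatch above (an almost 2GB dump from 295M of allocated space) is what you get by summing apparent file sizes instead of counting allocated blocks once. That pre-dump sanity check can be sketched as a small standalone script; this is not part of xfsdump, and the mountpoint path is just the example's:

```python
# Compare the bytes a naive dump will read() (sum of apparent file
# sizes) against what the filesystem reports as allocated. A large
# apparent/used ratio hints at reflinked or sparse data that the
# dump will inflate.
import os

def dump_size_estimate(mountpoint):
    """Return (apparent_bytes, used_bytes) for the tree at mountpoint."""
    apparent = 0
    for root, _dirs, files in os.walk(mountpoint):
        for name in files:
            try:
                apparent += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # file vanished mid-walk; skip it
    st = os.statvfs(mountpoint)
    used = (st.f_blocks - st.f_bfree) * st.f_frsize  # whole-fs usage
    return apparent, used

if __name__ == "__main__":
    a, u = dump_size_estimate("/mnt/tmp")  # mountpoint from the example
    print(f"apparent {a} bytes vs allocated {u} bytes")
```

On the 1GB test filesystem above this would report roughly 1.8GB apparent against roughly 295M allocated.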
* Re: xfsdump does not support reflink copied files properly
  2023-11-02 12:42 xfsdump does not support reflink copied files properly Alexander Puchmayr
@ 2023-11-02 16:39 ` Darrick J. Wong
  2023-11-02 21:47   ` Dave Chinner
  0 siblings, 1 reply; 3+ messages in thread
From: Darrick J. Wong @ 2023-11-02 16:39 UTC (permalink / raw)
  To: Alexander Puchmayr; +Cc: linux-xfs

On Thu, Nov 02, 2023 at 01:42:54PM +0100, Alexander Puchmayr wrote:
> Hi there,
>
> I just encountered a problem when trying to use xfsdump on a filesystem with
> lots of reflink copied vm disk images, yielding a dump file much larger than
> expected and which I also was unable to restore from (target disk full). I
> created a gentoo bug item under https://bugs.gentoo.org/916704 and I got
> advised to report it here as well.
>
> Copy from the bug report:
>
> sys-fs/xfsdump-3.1.12 seems to copy reflink copied files as ordinary files,
> resulting in a way too big dump file. Restoring from such a dump yields likely
> a out-of-diskspace condition.

Correct, xfsdump (and tar, and rsync...) does not know how to preserve
the sharing factor of a particular space extent.  All of those tools
walk the inodes on a filesystem, open them, and read() out the data.

Although there are ways to find out which file(s) own a piece of disk
space, each of those tools would most likely require a thorough redesign
of the dump file format to allow pointing to shared blocks elsewhere in
the dump file.  Or one could bolt a deduplicator onto the restore side,
which wouldn't help with the dumpfile size explosion but would at least
allow for compact restoration.

Regardless, nobody's submitted code to do any of those things.  Patches
welcome.

> It may be used as a denial-of-service tool which can be used by an ordinary
                      ^^^^^^^^^^^^^^^^^
Please do not file a CVE for this.
--D
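The restore-side deduplicator suggested above could, in its simplest form, hash the restored files and re-share byte-identical ones with the FICLONE ioctl. A rough sketch follows; it works at whole-file granularity only, the helper names are hypothetical, and only the ioctl number (from linux/fs.h) is real. FICLONE fails cleanly on filesystems without reflink support, in which case the plain copies are simply kept:

```python
# Post-restore dedup sketch: group files by content hash, then try to
# turn each duplicate into a reflink of the first copy via FICLONE.
import fcntl, hashlib

FICLONE = 0x40049409  # _IOW(0x94, 9, int), from linux/fs.h

def file_digest(path):
    """SHA-256 of a file's contents, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.digest()

def reshare(src_path, dst_path):
    """Try to make dst a reflink of src. Returns True on success."""
    with open(src_path, "rb") as src, open(dst_path, "r+b") as dst:
        try:
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
            return True
        except OSError:
            return False  # EOPNOTSUPP/EXDEV: no reflink, keep plain copy

def dedupe_tree(paths):
    """Reflink each duplicate to the first file seen with that content."""
    seen = {}
    shared = 0
    for p in paths:
        d = file_digest(p)
        if d in seen and reshare(seen[d], p):
            shared += 1
        seen.setdefault(d, p)
    return shared
```

A content-verifying alternative would be the kernel's FIDEDUPERANGE ioctl, which compares the ranges itself before sharing them.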
* Re: xfsdump does not support reflink copied files properly
  2023-11-02 16:39 ` Darrick J. Wong
@ 2023-11-02 21:47   ` Dave Chinner
  0 siblings, 0 replies; 3+ messages in thread
From: Dave Chinner @ 2023-11-02 21:47 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Alexander Puchmayr, linux-xfs

On Thu, Nov 02, 2023 at 09:39:53AM -0700, Darrick J. Wong wrote:
> On Thu, Nov 02, 2023 at 01:42:54PM +0100, Alexander Puchmayr wrote:
> > Hi there,
> >
> > I just encountered a problem when trying to use xfsdump on a filesystem with
> > lots of reflink copied vm disk images, yielding a dump file much larger than
> > expected and which I also was unable to restore from (target disk full). I
> > created a gentoo bug item under https://bugs.gentoo.org/916704 and I got
> > advised to report it here as well.
> >
> > Copy from the bug report:
> >
> > sys-fs/xfsdump-3.1.12 seems to copy reflink copied files as ordinary files,
> > resulting in a way too big dump file. Restoring from such a dump yields likely
> > a out-of-diskspace condition.
>
> Correct, xfsdump (and tar, and rsync...) does not know how to preserve
> the sharing factor of a particular space extent.  All of those tools
> walk the inodes on a filesystem, open them, and read() out the data.
>
> Although there are ways to find out which file(s) own a piece of disk
> space, each of those tools would most likely require a thorough redesign
> to the dump file format to allow pointing to shared blocks elsewhere in
> the dump file.

I don't think that is the case. Like XFS, xfsdump encodes the user data
it backs up in extent records, and it has different types of extents. It
currently understands "data" and "hole" extents as returned by
XFS_IOC_GETBMAPX, so we could extend that to encode "shared" extents
that point to an offset and length in a different inode.
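The scan-side question here, which extents of a file are shared, can also be answered from userspace with the filesystem-independent FIEMAP ioctl, whose extent flags include FIEMAP_EXTENT_SHARED on filesystems that track sharing. A minimal sketch, assuming the constants and struct layouts from linux/fs.h and the fiemap documentation; this is illustration, not xfsdump code:

```python
# Walk a file's extent map via FIEMAP and report which extents the
# filesystem marks as shared (reflinked). Struct layouts are hand-packed
# to match struct fiemap / struct fiemap_extent from linux/fiemap.h.
import fcntl, struct

FS_IOC_FIEMAP = 0xC020660B           # _IOWR('f', 11, struct fiemap)
FIEMAP_FLAG_SYNC = 0x1               # flush delalloc data first
FIEMAP_EXTENT_SHARED = 0x2000
HDR = struct.Struct("=QQIIII")       # fm_start, fm_length, fm_flags,
                                     # fm_mapped_extents, fm_extent_count, reserved
EXT = struct.Struct("=QQQQQIIII")    # fe_logical, fe_physical, fe_length,
                                     # fe_reserved64[2], fe_flags, fe_reserved[3]

def shared_extents(path, max_extents=256):
    """Return [(logical_offset, length, is_shared)] for mapped extents."""
    buf = bytearray(HDR.size + EXT.size * max_extents)
    HDR.pack_into(buf, 0, 0, 0xFFFFFFFFFFFFFFFF, FIEMAP_FLAG_SYNC,
                  0, max_extents, 0)
    with open(path, "rb") as f:
        fcntl.ioctl(f.fileno(), FS_IOC_FIEMAP, buf)
    n = HDR.unpack_from(buf, 0)[3]   # fm_mapped_extents filled by kernel
    out = []
    for i in range(n):
        logical, _phys, length, _r1, _r2, flags, *_ = EXT.unpack_from(
            buf, HDR.size + i * EXT.size)
        out.append((logical, length, bool(flags & FIEMAP_EXTENT_SHARED)))
    return out
```

On the reproduction filesystem above, every extent of testfile.dat through testfile6.dat would come back with the shared flag set.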
Yes, this means that during the scan we have to record all shared
extents with their underlying block numbers, and after the scan we need
to resolve each group to the single copy we are going to keep as a
normal data extent in the dump (i.e. the first to be restored). Then we
convert all the others to the new shared extent type that points at the
{ino, off, len} that contains the actual data in the dump.

Now all restore needs to do is run FICLONERANGE when it comes across a
shared extent - it's got all the info it needs in the dump to recreate
the shared extent. We can use restore-side ordering to guarantee that
the data we need to clone is already on disk (e.g. delay extent clones
until after all the normal data has been restored) so that all the
shared extents we restore end up with the correct data in them.

Yes, this means we need to bump the dump format version number to
support shared extents, but overall it's not a major revision of the
format or major surgery to the code base. It doesn't require kernel or
even XFS expertise to implement - it's all userspace stuff and fairly
straightforward - it just requires time, resources and commitment.

> Regardless, nobody's submitted code to do any of those things.  Patches
> welcome.

Yup, that is the biggest issue - there's always more to do than we have
people to do it.

> > It may be used as a denial-of-service tool which can be used by an ordinary
>
> Please do not file a CVE for this.

/me sighs

-Dave.
-- 
Dave Chinner
david@fromorbit.com
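The FICLONERANGE step Dave describes is plain userspace work and can be sketched directly. In this sketch only the ioctl and its struct file_clone_range argument are real kernel interfaces; the dump-record layout and helper names are made up for illustration:

```python
# Restore-side sketch: when a hypothetical "shared" extent record is
# met, reflink the already-restored bytes with FICLONERANGE instead of
# writing a second physical copy.
import fcntl, struct

FICLONERANGE = 0x4020940D  # _IOW(0x94, 13, struct file_clone_range)

def clone_range(src_fd, src_off, length, dst_fd, dst_off):
    """Reflink length bytes from src_fd@src_off to dst_fd@dst_off.
    Returns True on success, False if the fs cannot share the range."""
    # struct file_clone_range: s64 src_fd; u64 src_offset, src_length,
    # dest_offset (offsets/length must be block-aligned except at EOF).
    arg = struct.pack("=qQQQ", src_fd, src_off, length, dst_off)
    try:
        fcntl.ioctl(dst_fd, FICLONERANGE, arg)
        return True
    except OSError:
        return False  # EOPNOTSUPP/EXDEV/EINVAL: caller falls back to copying

def restore_shared_extent(record, open_inodes):
    """record: hypothetical dump entry (src_ino, src_off, dst_ino,
    dst_off, len); open_inodes maps ino -> fd of a restored file.
    Run after all normal data extents are on disk, per Dave's ordering."""
    src_ino, src_off, dst_ino, dst_off, length = record
    return clone_range(open_inodes[src_ino], src_off, length,
                       open_inodes[dst_ino], dst_off)
```

The delayed-clone ordering matters: the source range must already contain its final data, since the clone shares whatever blocks are mapped there at the time of the call.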