* Re: [Bug-tar] --sparse is broken on filesystems where small files may have zero blocks [not found] ` <10978717.9ApR7zKiv0@nb.usersys.redhat.com> @ 2013-10-29 15:27 ` Pavel Raiskup 2013-10-29 15:32 ` Joerg Schilling ` (2 more replies) 0 siblings, 3 replies; 6+ messages in thread From: Pavel Raiskup @ 2013-10-29 15:27 UTC (permalink / raw) To: bug-tar; +Cc: Andrew J. Schorr, Paul Eggert, linux-fsdevel, Andreas Dilger On Tuesday, October 29, 2013 09:59:56 Pavel Raiskup wrote: > > #define ST_IS_SPARSE(st) \ > > (ST_NBLOCKS (st) \ > > - < ((st).st_size / ST_NBLOCKSIZE + ((st).st_size % ST_NBLOCKSIZE != 0))) > > + < ((st).st_size / ST_NBLOCKSIZE \ > > + + ((st).st_size % ST_NBLOCKSIZE != 0 \ > > + && (st).st_size / ST_NBLOCKSIZE != 0))) > > May the st.st_size / ST_NBLOCKSIZE be greater than 1 and data still stored > in inode directly? Seems like on ext4 filesystem it is not possible [1] > but does anybody know about exception? > > [1] https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout#Inline_Data Well, I now recalled somehow relevant Red Hat bug, sorry I have not mentioned it before: https://bugzilla.redhat.com/show_bug.cgi?id=757557 CC'ing fs-devel: The question is whether that ^^^^ is not a bug in filesystem — whether filesystem should not _always_ return to fstat() block count at least 1 if there are at least some data (even if these data are inlined in inode)? Just for catching the context, this thread starts here: http://lists.gnu.org/archive/html/bug-tar/2013-10/msg00030.html If that is not a bug in fs, is there possible to detect that particular file is completely sparse? Pavel -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: --sparse is broken on filesystems where small files may have zero blocks 2013-10-29 15:27 ` [Bug-tar] --sparse is broken on filesystems where small files may have zero blocks Pavel Raiskup @ 2013-10-29 15:32 ` Joerg Schilling 2013-10-29 17:12 ` [Bug-tar] " Christoph Hellwig 2013-10-29 17:37 ` Jan Kara 2 siblings, 0 replies; 6+ messages in thread From: Joerg Schilling @ 2013-10-29 15:32 UTC (permalink / raw) To: praiskup, bug-tar; +Cc: linux-fsdevel, aschorr, eggert, adilger Pavel Raiskup <praiskup@redhat.com> wrote: > Well, I now recalled somehow relevant Red Hat bug, sorry I have not > mentioned it before: > https://bugzilla.redhat.com/show_bug.cgi?id=757557 > > CC'ing fs-devel: The question is whether that ^^^^ is not a bug in > filesystem ??? whether filesystem should not _always_ return to fstat() > block count at least 1 if there are at least some data (even if these data > are inlined in inode)? Just for catching the context, this thread starts > here: http://lists.gnu.org/archive/html/bug-tar/2013-10/msg00030.html > > If that is not a bug in fs, is there possible to detect that particular > file is completely sparse? In September 2004 I defined the SEEK_HOLE/SEEK_DATA interface together with Jeff Bonwick. I believe that we should require that all OS that include filesystems that return nblocks == 0 with data in file need to implement SEEK_HOLE/SEEK_DATA. Jörg -- EMail:joerg@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin js@cs.tu-berlin.de (uni) joerg.schilling@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/ URL: http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Bug-tar] --sparse is broken on filesystems where small files may have zero blocks 2013-10-29 15:27 ` [Bug-tar] --sparse is broken on filesystems where small files may have zero blocks Pavel Raiskup 2013-10-29 15:32 ` Joerg Schilling @ 2013-10-29 17:12 ` Christoph Hellwig 2013-10-29 17:37 ` Jan Kara 2 siblings, 0 replies; 6+ messages in thread From: Christoph Hellwig @ 2013-10-29 17:12 UTC (permalink / raw) To: Pavel Raiskup Cc: bug-tar, Andrew J. Schorr, Paul Eggert, linux-fsdevel, Andreas Dilger On Tue, Oct 29, 2013 at 04:27:02PM +0100, Pavel Raiskup wrote: > CC'ing fs-devel: The question is whether that ^^^^ is not a bug in > filesystem ??? whether filesystem should not _always_ return to fstat() > block count at least 1 if there are at least some data (even if these data > are inlined in inode)? Just for catching the context, this thread starts > here: http://lists.gnu.org/archive/html/bug-tar/2013-10/msg00030.html > > If that is not a bug in fs, is there possible to detect that particular > file is completely sparse? For XFS we only support inline data for non-regular files at the moment, but for example an inode that has data stored just inline will report zero block, and the unfinished patches to support inline regular file data would logically do the same. As already pointed out by Joerg we do have a proper lseek interface to detect if there are holes. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Bug-tar] --sparse is broken on filesystems where small files may have zero blocks 2013-10-29 15:27 ` [Bug-tar] --sparse is broken on filesystems where small files may have zero blocks Pavel Raiskup 2013-10-29 15:32 ` Joerg Schilling 2013-10-29 17:12 ` [Bug-tar] " Christoph Hellwig @ 2013-10-29 17:37 ` Jan Kara 2013-10-29 22:01 ` Andreas Dilger 2 siblings, 1 reply; 6+ messages in thread From: Jan Kara @ 2013-10-29 17:37 UTC (permalink / raw) To: Pavel Raiskup Cc: bug-tar, Andrew J. Schorr, Paul Eggert, linux-fsdevel, Andreas Dilger On Tue 29-10-13 16:27:02, Pavel Raiskup wrote: > On Tuesday, October 29, 2013 09:59:56 Pavel Raiskup wrote: > > > #define ST_IS_SPARSE(st) \ > > > (ST_NBLOCKS (st) \ > > > - < ((st).st_size / ST_NBLOCKSIZE + ((st).st_size % ST_NBLOCKSIZE != 0))) > > > + < ((st).st_size / ST_NBLOCKSIZE \ > > > + + ((st).st_size % ST_NBLOCKSIZE != 0 \ > > > + && (st).st_size / ST_NBLOCKSIZE != 0))) > > > > May the st.st_size / ST_NBLOCKSIZE be greater than 1 and data still stored > > in inode directly? Seems like on ext4 filesystem it is not possible [1] > > but does anybody know about exception? > > > > [1] https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout#Inline_Data > > Well, I now recalled somehow relevant Red Hat bug, sorry I have not > mentioned it before: > https://bugzilla.redhat.com/show_bug.cgi?id=757557 > > CC'ing fs-devel: The question is whether that ^^^^ is not a bug in > filesystem — whether filesystem should not _always_ return to fstat() > block count at least 1 if there are at least some data (even if these data > are inlined in inode)? Just for catching the context, this thread starts > here: http://lists.gnu.org/archive/html/bug-tar/2013-10/msg00030.html So 'st_blocks' should be "the number of blocks allocated to the file, 512-byte units". If we are able to store the whole file within inode, we have no blocks allocated and thus setting st_blocks to 0 looks as a decent thing to do. Looking into filesystems where this is possible (ext4, btrfs, reiserfs) they will all set st_blocks to 0 if the file body is inlined and the size is smaller than 512 bytes. But OTOH btrfs and reiserfs *do* in fact account the amount of space consumed by the file. It is just that stat(2) syscall reports the space in 512-byte units and for historical reasons we ended up just truncating the byte-precision space counter instead of rounding it up (that is a mistake I made like 10 years ago :(). It is easy enough to start reporting the value rounded up but I'm not sure if it won't break some userspace which already developped some dependency on this buggy kernel behavior. ext4 is yet a different matter. It does really report the number of allocated blocks in st_blocks so it will report 0 while data can fit into the inode (whose size is configurable during fs creation, default is 256). In practice that will result in reporting non-zero st_blocks for 512-byte and larger files anyways so there won't be an observable difference between what ext4 and btrfs / reiserfs do. But we might still want to fix up ext4 to be consistent with btrfs and reiserfs so that things are more future-proof. > If that is not a bug in fs, is there possible to detect that particular > file is completely sparse? As Joerg wrote, SEEK_DATA / SEEK_HOLE is a proper interface for this at least for systems that support it. Once you have called stat(2) on the file, inode will be in cache anyways so the additional cost of open(2), lseek(2), close(2) won't be that big. For systems that don't support SEEK_DATA / SEEK_HOLE, you can use some heuristic like: if (st.st_size < st.st_blksize || st.st_blocks > 0) /* Bite the bullet and scan the data for non-zero bytes */ else /* Assume the file is sparse */ Honza -- Jan Kara <jack@suse.cz> SUSE Labs, CR -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Bug-tar] --sparse is broken on filesystems where small files may have zero blocks 2013-10-29 17:37 ` Jan Kara @ 2013-10-29 22:01 ` Andreas Dilger 2013-10-31 12:11 ` Carlos Maiolino 0 siblings, 1 reply; 6+ messages in thread From: Andreas Dilger @ 2013-10-29 22:01 UTC (permalink / raw) To: Jan Kara Cc: Pavel Raiskup, bug-tar, Andrew J. Schorr, Paul Eggert, linux-fsdevel@vger.kernel.org Devel On Oct 29, 2013, at 11:37 AM, Jan Kara <jack@suse.cz> wrote: > On Tue 29-10-13 16:27:02, Pavel Raiskup wrote: >> >> Well, I now recalled somehow relevant Red Hat bug, sorry I have not >> mentioned it before: >> https://bugzilla.redhat.com/show_bug.cgi?id=757557 >> >> CC'ing fs-devel: The question is whether that ^^^^ is not a bug in >> filesystem — whether filesystem should not _always_ return to fstat() >> block count at least 1 if there are at least some data (even if these data >> are inlined in inode)? Just for catching the context, this thread starts >> here: http://lists.gnu.org/archive/html/bug-tar/2013-10/msg00030.html > > So 'st_blocks' should be "the number of blocks allocated to the file, > 512-byte units". If we are able to store the whole file within inode, we > have no blocks allocated and thus setting st_blocks to 0 looks as a decent > thing to do. Looking into filesystems where this is possible (ext4, btrfs, > reiserfs) they will all set st_blocks to 0 if the file body is inlined and > the size is smaller than 512 bytes. One could consider that any inode with inline data still needs to have the inode allocated to hold the data. > ext4 is yet a different matter. It does really report the number of > allocated blocks in st_blocks so it will report 0 while data can fit into > the inode (whose size is configurable during fs creation, default is 256). > In practice that will result in reporting non-zero st_blocks for 512-byte > and larger files anyways so there won't be an observable difference between > what ext4 and btrfs / reiserfs do. But we might still want to fix up ext4 > to be consistent with btrfs and reiserfs so that things are more > future-proof. Given that tar and rsync DO exist that silently drop user data for files with st_blocks == 0 (we have seen this in with Lustre tests) it makes sense to fix the ext4_getattr() to set st_blocks = 1 for files with inline data. It is already doing something similar for files with delalloc blocks to work around the same problem. >> If that is not a bug in fs, is there possible to detect that particular >> file is completely sparse? > As Joerg wrote, SEEK_DATA / SEEK_HOLE is a proper interface for this at > least for systems that support it. Once you have called stat(2) on the > file, inode will be in cache anyways so the additional cost of open(2), > lseek(2), close(2) won't be that big. For systems that don't support > SEEK_DATA / SEEK_HOLE, you can use some heuristic like: > if (st.st_size < st.st_blksize || st.st_blocks > 0) > /* Bite the bullet and scan the data for non-zero bytes */ > else > /* Assume the file is sparse */ I don’t see how this is better than the existing heuristic: if (st.st_blocks > 0) /* file has data */ else /* file is sparse */ that tools/applications are already using today. It isn’t possible to retroactively fix tar, rsync, etc. to use SEEK_DATA/SEEK_HOLE, but in the ext4 case the inline data feature is very new and it makes sense to fix this in the kernel. Of course it ALSO makes sense to fix this in userspace to handle the (st.st_size < st.st_blksize) case (this is not a lot of overhead) so that the chance of losing data in a variety of kernel/userspace combinations is minimized. Cheers, Andreas -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Bug-tar] --sparse is broken on filesystems where small files may have zero blocks 2013-10-29 22:01 ` Andreas Dilger @ 2013-10-31 12:11 ` Carlos Maiolino 0 siblings, 0 replies; 6+ messages in thread From: Carlos Maiolino @ 2013-10-31 12:11 UTC (permalink / raw) To: Andreas Dilger Cc: Jan Kara, Pavel Raiskup, bug-tar, Andrew J. Schorr, Paul Eggert, linux-fsdevel@vger.kernel.org Devel On Tue, Oct 29, 2013 at 04:01:16PM -0600, Andreas Dilger wrote: > > On Oct 29, 2013, at 11:37 AM, Jan Kara <jack@suse.cz> wrote: > > > On Tue 29-10-13 16:27:02, Pavel Raiskup wrote: > >> > >> Well, I now recalled somehow relevant Red Hat bug, sorry I have not > >> mentioned it before: > >> https://bugzilla.redhat.com/show_bug.cgi?id=757557 > >> > >> CC'ing fs-devel: The question is whether that ^^^^ is not a bug in > >> filesystem — whether filesystem should not _always_ return to fstat() > >> block count at least 1 if there are at least some data (even if these data > >> are inlined in inode)? Just for catching the context, this thread starts > >> here: http://lists.gnu.org/archive/html/bug-tar/2013-10/msg00030.html > > > > So 'st_blocks' should be "the number of blocks allocated to the file, > > 512-byte units". If we are able to store the whole file within inode, we > > have no blocks allocated and thus setting st_blocks to 0 looks as a decent > > thing to do. Looking into filesystems where this is possible (ext4, btrfs, > > reiserfs) they will all set st_blocks to 0 if the file body is inlined and > > the size is smaller than 512 bytes. > > One could consider that any inode with inline data still needs to have > the inode allocated to hold the data. > > > ext4 is yet a different matter. It does really report the number of > > allocated blocks in st_blocks so it will report 0 while data can fit into > > the inode (whose size is configurable during fs creation, default is 256). > > In practice that will result in reporting non-zero st_blocks for 512-byte > > and larger files anyways so there won't be an observable difference between > > what ext4 and btrfs / reiserfs do. But we might still want to fix up ext4 > > to be consistent with btrfs and reiserfs so that things are more > > future-proof. > > Given that tar and rsync DO exist that silently drop user data for > files with st_blocks == 0 (we have seen this in with Lustre tests) > it makes sense to fix the ext4_getattr() to set st_blocks = 1 for > files with inline data. It is already doing something similar for > files with delalloc blocks to work around the same problem. Blocks used to store inode metadata are not considered data blocks, so, these are not reported. In case of delalloc, afaik, it's doing it to reserve space and in case it's not used, well, it's removed from the counter. But, it differs from the inline case since you'll not be using the reserved delalloc blocks (unless the file is extended, of course). I'm not pretty sure about this, but if you account the space used by inline data as a 'block', you might be accounting the same block twice, one as metadata block, and another as data block. Also, the idea of inline data is exactly to avoid block allocation for very small files that fit into inode's literal area, setting st_blocks to 1 would go against this purpose. And also, free-space count might be wrong setting it to 1 too. > > >> If that is not a bug in fs, is there possible to detect that particular > >> file is completely sparse? > > As Joerg wrote, SEEK_DATA / SEEK_HOLE is a proper interface for this at > > least for systems that support it. Once you have called stat(2) on the > > file, inode will be in cache anyways so the additional cost of open(2), > > lseek(2), close(2) won't be that big. For systems that don't support > > SEEK_DATA / SEEK_HOLE, you can use some heuristic like: > > if (st.st_size < st.st_blksize || st.st_blocks > 0) > > /* Bite the bullet and scan the data for non-zero bytes */ > > else > > /* Assume the file is sparse */ > > I don’t see how this is better than the existing heuristic: > > if (st.st_blocks > 0) > /* file has data */ > else > /* file is sparse */ > > that tools/applications are already using today. It isn’t possible > to retroactively fix tar, rsync, etc. to use SEEK_DATA/SEEK_HOLE, but > in the ext4 case the inline data feature is very new and it makes > sense to fix this in the kernel. Of course it ALSO makes sense to > fix this in userspace to handle the (st.st_size < st.st_blksize) > case (this is not a lot of overhead) so that the chance of losing > data in a variety of kernel/userspace combinations is minimized. > > Cheers, Andreas > > > > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Carlos -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2013-10-31 12:11 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- [not found] <20131028211739.GA26741@ti119.telemetry-investments.com> [not found] ` <526F376E.5030307@cs.ucla.edu> [not found] ` <10978717.9ApR7zKiv0@nb.usersys.redhat.com> 2013-10-29 15:27 ` [Bug-tar] --sparse is broken on filesystems where small files may have zero blocks Pavel Raiskup 2013-10-29 15:32 ` Joerg Schilling 2013-10-29 17:12 ` [Bug-tar] " Christoph Hellwig 2013-10-29 17:37 ` Jan Kara 2013-10-29 22:01 ` Andreas Dilger 2013-10-31 12:11 ` Carlos Maiolino
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).