* [BUG] copy file result with zero
From: Dave Young @ 2011-10-01 14:01 UTC
To: Linux Kernel Mailing List; +Cc: linux-ext4
Hi,
Weird problem: when I build an app from source with
make; make install
and run the installed command, I get "cannot execute binary file".
hexdump shows the installed binary is full of zeros.
Is it related to the ext4 FIEMAP problem described below?
http://lwn.net/Articles/429349/
I finally managed to find a way to reproduce this:
just cp an ELF binary A to file B, then cp B to file C, and you will get:
A == B != C
i.e.
cp /bin/ls ls1
cp ls1 ls2
ls2 will be filled with zeros.
Below is an strace log of install; the kernel version is 3.1.0-rc6+.
geteuid() = 0
umask(0) = 022
stat("/tmp/vpnc", 0x7fff85363710) = -1 ENOENT (No such file or directory)
stat("vpnc", {st_mode=S_IFREG|0755, st_size=368662, ...}) = 0
lstat("/tmp/vpnc", 0x7fff85363250) = -1 ENOENT (No such file or directory)
open("vpnc", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0755, st_size=368662, ...}) = 0
open("/tmp/vpnc", O_WRONLY|O_CREAT|O_EXCL, 0755) = 4
fstat(4, {st_mode=S_IFREG|0755, st_size=0, ...}) = 0
uname({sys="Linux", node="darkstar", ...}) = 0
ioctl(3, FS_IOC_FIEMAP, 0x7fff85361f60) = 0
ftruncate(4, 368662) = 0
fsetxattr(4, "system.posix_acl_access", "\x02\x00\x00\x00\x01\x00\x06\x00\xff\xff\xff\xff\x04\x00\x00\x00\xff\xff\xff\xff \x00\x00\x00\xff\xff\xff\xff", 28, 0) = 0
close(4) = 0
close(3) = 0
chmod("/tmp/vpnc", 0755) = 0
close(0) = 0
close(1) = 0
close(2) = 0
exit_group(0) = ?
--
Regards
Dave
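
The A == B != C check above, done by eye with hexdump, can also be scripted. The following is only a sketch: the helper name summarize is made up for illustration and is not part of any tool mentioned in this thread. It reports the size, a SHA-256 digest, and whether every byte of a file is zero, so the three files can be compared directly.

```python
import hashlib


def summarize(path, chunk_size=1 << 16):
    """Return (size, sha256 hex digest, all_zero) for the file at path.

    summarize is a hypothetical helper for illustration. An empty file
    reports all_zero=True, so check the size as well.
    """
    digest = hashlib.sha256()
    size = 0
    all_zero = True
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            size += len(chunk)
            digest.update(chunk)
            # bytes.count(0) counts NUL bytes; any other byte clears the flag.
            if all_zero and chunk.count(0) != len(chunk):
                all_zero = False
    return size, digest.hexdigest(), all_zero
```

Running summarize on /bin/ls, ls1 and ls2 from the reproduction would be expected to show equal sizes for all three, with a differing digest and all_zero set only for the corrupted copy.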

* Re: [BUG] copy file result with zero
From: Ted Ts'o @ 2011-10-01 14:39 UTC
To: Dave Young; +Cc: Linux Kernel Mailing List, linux-ext4

On Sat, Oct 01, 2011 at 10:01:35PM +0800, Dave Young wrote:
> Hi,
>
> Weird problem, when I build app from source,
> make; make install
> run the command, but got "cannot execute binary file"
>
> hexdump shows the installed binary is full of zero
>
> Is it related to ext4 fiemap problem described below?
> http://lwn.net/Articles/429349/

There is general agreement that /bin/cp should not have been relying
on FIEMAP, and I believe the more recent versions of /bin/cp have
removed that code by default pending implementation of
SEEK_HOLE/SEEK_DATA. That being said, ext4 had a workaround to its
FIEMAP implementation that landed in 2.6.39, and you're using
3.1.0-rc6.

> I finally managed to find the way to reproduce this:
> just cp a elf binary A to file B, then cp B to file C, then you will get:
> A == B != C
>
> ie.
> cp /bin/ls ls1
> cp ls1 ls2
>
> ls2 will be filled with zero

If you add a "sync" between the two copies, does that work around the
problem? I bet it will...

My suggestion is to upgrade to a newer version of coreutils that
doesn't try to use FIEMAP.

- Ted
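
For readers unfamiliar with the SEEK_HOLE/SEEK_DATA interface mentioned above, here is a rough sketch of how a sparse-aware copy can be built on lseek() instead of FIEMAP. It is not the coreutils implementation; the function name copy_with_seek_data and its parameters are assumptions made for illustration. It relies on os.SEEK_DATA and os.SEEK_HOLE, which Python exposes only on platforms that support them.

```python
import errno
import os


def copy_with_seek_data(src_path, dst_path, bufsize=1 << 16):
    """Copy src to dst, reading only the ranges lseek() reports as data.

    Hypothetical helper for illustration; not how cp(1) is implemented.
    """
    src = os.open(src_path, os.O_RDONLY)
    try:
        dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            size = os.fstat(src).st_size
            offset = 0
            while offset < size:
                try:
                    start = os.lseek(src, offset, os.SEEK_DATA)
                except OSError as exc:
                    if exc.errno == errno.ENXIO:
                        break  # only a hole remains before end of file
                    raise
                end = os.lseek(src, start, os.SEEK_HOLE)
                os.lseek(src, start, os.SEEK_SET)
                os.lseek(dst, start, os.SEEK_SET)
                while start < end:
                    chunk = os.read(src, min(bufsize, end - start))
                    if not chunk:
                        break  # source shrank while we were copying
                    view = memoryview(chunk)
                    while view:
                        # os.write may write fewer bytes than requested.
                        view = view[os.write(dst, view):]
                    start += len(chunk)
                offset = end
            # Extend dst to the full length so a trailing hole is kept.
            os.ftruncate(dst, size)
        finally:
            os.close(dst)
    finally:
        os.close(src)
```

The design point is that the copier asks the kernel where data is rather than where blocks have been allocated, so it does not need to interpret extent flags itself.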

* Re: [BUG] copy file result with zero
From: Dave Young @ 2011-10-01 23:37 UTC
To: Ted Ts'o, Dave Young, Linux Kernel Mailing List, linux-ext4

On Sat, Oct 1, 2011 at 10:39 PM, Ted Ts'o <tytso@mit.edu> wrote:
> There is general agreement that /bin/cp should not have been relying
> on FIEMAP, and I believe the more recent versions of /bin/cp have
> removed that code by default pending implementation of
> SEEK_HOLE/SEEK_DATA. That being said, ext4 had a workaround to its
> FIEMAP implementation that landed in 2.6.39, and you're using
> 3.1.0-rc6.

Do you mean it should work in 3.1.0-rc6 even with a cp that depends on FIEMAP?

> If you add a "sync" between the two copies, does that work around the
> problem? I bet it will...

Yes, it works.

> My suggestion is to upgrade to a newer version of coreutils that
> doesn't try to use FIEMAP.

Thanks, will try.

--
Regards
Dave

* Re: [BUG] copy file result with zero
From: Jeff liu @ 2011-10-02 6:41 UTC
To: Dave Young; +Cc: Ted Ts'o, Linux Kernel Mailing List, linux-ext4

> On Sat, Oct 1, 2011 at 10:39 PM, Ted Ts'o <tytso@mit.edu> wrote:
>> There is general agreement that /bin/cp should not have been relying
>> on FIEMAP, and I believe the more recent versions of /bin/cp have
>> removed that code by default pending implementation of
>> SEEK_HOLE/SEEK_DATA. That being said, ext4 had a workaround to its
>> FIEMAP implementation that landed in 2.6.39, and you're using
>> 3.1.0-rc6.

Actually, upstream cp(1) uses FIEMAP only if the source file is sparse; otherwise it does a normal, block-based copy.

Thanks,
-Jeff

* Re: [BUG] copy file result with zero
From: Andreas Dilger @ 2011-10-02 7:59 UTC
To: Jeff liu, Yongqiang Yang
Cc: Dave Young, Ted Ts'o, Linux Kernel Mailing List, linux-ext4

On 2011-10-01, at 11:41 PM, Jeff liu wrote:
> Actually, upstream cp(1) using FIEMAP only if the source file is sparse, or else, it will do normal copy, i.e, block based.

My understanding is that cp uses the blocks count to determine whether the file is sparse or not. In the case of delayed allocation (where blocks are not yet allocated, if they are not reflected in the i_blocks count) it might mistakenly think that the file is sparse.

Given the danger of this bug, it is important to ensure ext4 returns DELALLOC extents for pages in the page cache. I think Yongqiang Yang just submitted a patch series to do this for ext4, so it would be important to verify it fixes this problem.

Cheers, Andreas
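
The block-count heuristic described above can be sketched as follows. This only approximates the idea (fewer allocated blocks than the file size would need) and is not copied from coreutils; the names looks_sparse, path_looks_sparse and BLOCK_UNIT are made up for illustration.

```python
import os

BLOCK_UNIT = 512  # st_blocks is counted in 512-byte units on Linux


def looks_sparse(st_size, st_blocks):
    """Guess sparseness from sizes alone: fewer blocks than the size needs.

    Hypothetical helper; an approximation of a block-count heuristic.
    """
    return st_blocks < st_size // BLOCK_UNIT


def path_looks_sparse(path):
    """Apply the same guess to a file on disk via stat()."""
    st = os.stat(path)
    return looks_sparse(st.st_size, st.st_blocks)
```

For the 368662-byte file in the strace log, looks_sparse(368662, 0) would be True, while a fully allocated file with 728 blocks would not look sparse. Whether ext4 actually reports too few blocks for delayed-allocation data is disputed later in the thread.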

* Re: [BUG] copy file result with zero
From: Jeff liu @ 2011-10-02 7:14 UTC
To: Andreas Dilger
Cc: Yongqiang Yang, Dave Young, Ted Ts'o, Linux Kernel Mailing List, linux-ext4

On 2011-10-2, at 3:59 PM, Andreas Dilger wrote:
> My understanding is that cp uses the blocks count to determine whether the file is sparse or not.

Yes, it uses the block count to determine that.

> In the case of delayed allocation (where blocks are not yet allocated, if they are not reflected in the i_blocks count) it might mistakenly think that the file is sparse.

Thanks for pointing this out, I missed this case.
So for Dave's issue, even if he updates to upstream coreutils, the problem will still occur occasionally with delayed allocation unless sync is run between the copies.

> Given the danger of this bug, it is important to ensure ext4 returns DELALLOC extents for pages in the page cache. I think Yongqiang Yang just submitted a patch series to do this for ext4, so it would be important to verify it fixes this problem.

Thanks,
-Jeff

* Re: [BUG] copy file result with zero
From: Dave Young @ 2011-10-02 8:46 UTC
To: Jeff liu
Cc: Andreas Dilger, Yongqiang Yang, Ted Ts'o, Linux Kernel Mailing List, linux-ext4

2011/10/2 Jeff liu <jeff.liu@oracle.com>:
> On 2011-10-2, at 3:59 PM, Andreas Dilger wrote:
>> In the case of delayed allocation (where blocks are not yet allocated, if they are not reflected in the i_blocks count) it might mistakenly think that the file is sparse.

I think this might be my case.

> Thanks for pointing this out, I missed this case.
> So for Dave's issue, even if he updated to the upstream Coreutils, this issue will still exists occasionally for delayed allocation, if not run sync in between times.

Not occasionally; I have been able to reproduce it easily recently.

--
Regards
Dave

* Re: [BUG] copy file result with zero
From: Christoph Hellwig @ 2011-10-02 11:54 UTC
To: Andreas Dilger
Cc: Jeff liu, Yongqiang Yang, Dave Young, Ted Ts'o, Linux Kernel Mailing List, linux-ext4

On Sun, Oct 02, 2011 at 12:59:22AM -0700, Andreas Dilger wrote:
> My understanding is that cp uses the blocks count to determine whether the file is sparse or not. In the case of delayed allocation (where blocks are not yet allocated, if they are not reflected in the i_blocks count) it might mistakenly think that the file is sparse.

Ext4 fortunately is smart enough to add the delalloc blocks to st_blocks for stat, just like all other filesystems implementing delayed allocation.

* Re: [BUG] copy file result with zero
From: Yongqiang Yang @ 2011-10-03 9:26 UTC
To: Andreas Dilger, Lukas Czerner
Cc: Jeff liu, Dave Young, Ted Ts'o, Linux Kernel Mailing List, linux-ext4

On Sun, Oct 2, 2011 at 3:59 PM, Andreas Dilger <aedilger@gmail.com> wrote:
> My understanding is that cp uses the blocks count to determine whether the file is sparse or not. In the case of delayed allocation (where blocks are not yet allocated, if they are not reflected in the i_blocks count) it might mistakenly think that the file is sparse.
>
> Given the danger of this bug, it is important to ensure ext4 returns DELALLOC extents for pages in the page cache. I think Yongqiang Yang just submitted a patch series to do this for ext4, so it would be important to verify it fixes this problem.

It seems that the patch [ext4: in fiemap use FIEMAP_EXTENT_LAST flag for
last extent] (http://www.spinics.net/lists/linux-ext4/msg25698.html)
that Lukas submitted makes FIEMAP ignore delayed extents beyond the
last allocated block. For example, with the layout

AAAHHHHDDDD

(A - allocated, H - hole, D - delayed alloc), the trailing delayed
extent is ignored.

Yongqiang.

--
Best Wishes
Yongqiang Yang
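
The AAAHHHHDDDD example can be pictured with a small toy model. This is purely illustrative Python, not ext4 or coreutils code: the functions extents, report and blocks_copied, and the last_counts_delayed switch, are all invented here. The model assumes the "last" flag is placed on the final allocated extent and that a reader stops at the first extent carrying that flag.

```python
from itertools import groupby


def extents(layout):
    """Return (start, length, kind) for every non-hole run in layout."""
    runs, pos = [], 0
    for kind, group in groupby(layout):
        length = len(list(group))
        if kind != "H":  # holes are simply not reported
            runs.append((pos, length, kind))
        pos += length
    return runs


def report(layout, last_counts_delayed):
    """Model an extent listing as (start, length, kind, is_last) tuples.

    With last_counts_delayed=False only allocated ("A") extents are
    considered when choosing which extent gets the last flag.
    """
    runs = extents(layout)
    candidates = [r for r in runs if last_counts_delayed or r[2] == "A"]
    last = candidates[-1] if candidates else None
    return [(s, n, k, (s, n, k) == last) for (s, n, k) in runs]


def blocks_copied(reply):
    """Model a reader that stops at the first extent flagged as last."""
    copied = set()
    for start, length, _kind, is_last in reply:
        copied.update(range(start, start + length))
        if is_last:
            break
    return copied
```

In this model, blocks_copied(report("AAAHHHHDDDD", False)) covers only blocks 0 to 2, so the delayed blocks 7 to 10 would be left as zeros in the copy, while passing True covers both runs.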

* Re: [BUG] copy file result with zero
From: Lukas Czerner @ 2011-10-03 13:11 UTC
To: Yongqiang Yang
Cc: Andreas Dilger, Lukas Czerner, Jeff liu, Dave Young, Ted Ts'o, Linux Kernel Mailing List, linux-ext4

On Mon, 3 Oct 2011, Yongqiang Yang wrote:
> It seemed the patch[ ext4: in fiemap use FIEMAP_EXTENT_LAST flag for
> last extent] (http://www.spinics.net/lists/linux-ext4/msg25698.html)
> Lukas submitted on FIEMAP which ignores delayed extents beyond the
> last allocated block. e.g. AAAHHHHDDDD
> A - allocated, H - hole, D - delayed alloc, then the ending delayed
> extent is ignored.

Oops, you're right. I think that the best solution would be to revert
the commit

c03f8aa9abdd517477c2021ea1251939b4da49e6
ext4: use FIEMAP_EXTENT_LAST flag for last extent in fiemap

and then fix the original problem with your delayed extent tree
solution, where we can easily check not only for the next allocated
extent, but also for the next delayed extent to see if the current one
is last or not.

Currently, the problem is that at the point where we are filling the
fiemap extent with fiemap_fill_next_extent() we do not have enough
information to say whether the extent is really the last one or not.
And currently there is no easy way to check for the next delayed extent
(which will be fixed with your delayed extent tree).

I do not know how "ready" your patches are. Is it possible to wait for
them to be ready and fix it in your patch set? That means, revert the
mentioned commit and reimplement fiemap with the delayed extent tree.

Thanks!
-Lukas

* Re: [BUG] copy file result with zero
From: Ted Ts'o @ 2011-10-03 14:18 UTC
To: Lukas Czerner
Cc: Yongqiang Yang, Andreas Dilger, Jeff liu, Dave Young, Linux Kernel Mailing List, linux-ext4

On Mon, Oct 03, 2011 at 03:11:30PM +0200, Lukas Czerner wrote:
>
> Oops, you're right. I think that the best solution would be to revert
> the commit
>
> c03f8aa9abdd517477c2021ea1251939b4da49e6
> ext4: use FIEMAP_EXTENT_LAST flag for last extent in fiemap
>
> and then fix the original problem with your delayed extent tree
> solution, where we can easily check not only for next allocated extent,
> but also for next delayed extent to see if the current one is last or
> not.
> ...
>
> I do not know how "ready" are your patches..Is it possible to wait for
> them to be ready and fix it in your patch set ? That means, revert the
> mentioned commit and reimplement fiemap with delayed extent tree.

Sigh, yeah, we need to fix this to avoid the hang in xfstests #252 but
users losing data even if the coreutils release was only out there for
13 days is bad juju.

I'm working on reviewing the kernel patch backlog this week, and I'll
give this series one priority.

Thanks to Yongqiang and Lukas for looking into this!

- Ted

* Re: [BUG] copy file result with zero
From: Lukas Czerner @ 2011-10-03 15:51 UTC
To: Ted Ts'o
Cc: Lukas Czerner, Yongqiang Yang, Andreas Dilger, Jeff liu, Dave Young, Linux Kernel Mailing List, linux-ext4

On Mon, 3 Oct 2011, Ted Ts'o wrote:
> Sigh, yeah, we need to fix this to avoid the hang in xfstests #252 but
> users losing data even if the coreutils release was only out there for
> 13 days is bad juju.
>
> I'm working on reviewing the kernel patch backlog this week, and I'll
> give this series one priority.

Actually, the series needs to be changed to fix the problem. I'll
comment on the appropriate patch.

Thanks!
-Lukas

* Re: [BUG] copy file result with zero
From: Andreas Dilger @ 2011-10-02 8:02 UTC
To: Dave Young
Cc: Jeff liu, Ted Ts'o, Linux Kernel Mailing List, linux-ext4 development

On 2011-10-01, at 11:41 PM, Jeff liu wrote:
> Actually, upstream cp(1) using FIEMAP only if the source file is sparse, or else, it will do normal copy, i.e, block based.

Are there any distros that are shipping with a version of cp that depends on FIEMAP? That would dramatically increase the severity of this problem, since orders of magnitude more users will hit the problem.

Dave, what distro were you seeing this problem on, and had you installed/upgraded your coreutils and/or kernel yourself?

Cheers, Andreas
* Re: [BUG] copy file result with zero 2011-10-02 8:02 ` Andreas Dilger @ 2011-10-02 8:43 ` Dave Young 2011-10-03 11:08 ` Pádraig Brady 0 siblings, 1 reply; 15+ messages in thread From: Dave Young @ 2011-10-02 8:43 UTC (permalink / raw) To: Andreas Dilger Cc: Jeff liu, Ted Ts'o, Linux Kernel Mailing List, linux-ext4 development On Sun, Oct 2, 2011 at 4:02 PM, Andreas Dilger <aedilger@gmail.com> wrote: > On 2011-10-01, at 11:41 PM, Jeff liu wrote: >>> On Sat, Oct 1, 2011 at 10:39 PM, Ted Ts'o <tytso@mit.edu> wrote: >>>> On Sat, Oct 01, 2011 at 10:01:35PM +0800, Dave Young wrote: >>>>> Weird problem, when I build app from source, >>>>> make; make install >>>>> run the command, but got "cannot execute binary file" >>>>> >>>>> hexdump shows the installed binary is full of zero >>>>> >>>>> Is it related to ext4 fiemap problem described below? >>>>> http://lwn.net/Articles/429349/ >>>> >>>> There is general agreement that /bin/cp should not have been relying >>>> on FIEMAP, and I believe the more recent versions of /bin/cp have >>>> removed that code by default pending implementation of >>>> SEEK_HOLE/SEEK_DATA. That being said, ext4 had a workaround to its >>>> FIEMAP implementation that landed in 2.6.39, and you're using >>>> 3.1.0-rc6. >> >> Actually, upstream cp(1) using FIEMAP only if the source file is sparse, or else, it will do normal copy, i.e, block based. > > Are there any distros that are shipping with a version of cp that depends on FIEMAP? That would dramatically increase the severity of this problem, since orders of magnitude more users will hit the problem. I'm not sure if it depends on FIEMAP, I think it should be not so old. > > Dave, what distro were you seeing this problem on, and had you installed/upgraded your coreutils and/or kernel yourself? Slackware 13.37, coreutils 8.11 kernel is always built from linus's git by myself > >>> Do you means It should work in 3.1.0-rc6 even with cp which depends fiemap? 
>>>
>>>>
>>>>> I finally managed to find the way to reproduce this:
>>>>> just cp a elf binary A to file B, then cp B to file C, then you will get:
>>>>> A == B != C
>>>>>
>>>>> ie.
>>>>> cp /bin/ls ls1
>>>>> cp ls1 ls2
>>>>>
>>>>> ls2 will be filled with zero
>>>>
>>>> If you add a "sync" between the two copies, does that work around the
>>>> problem? I bet it will...
>>>
>>> Yes, it works
>>>
>>>>
>>>> My suggestion is to upgrade to a newer version of coreutils that
>>>> doesn't try to use FIEMAP.
>>>
>>> Thanks, will try
>>>
>>>>
>>>> - Ted
>>>>
>>>
>>>
>>>
>>> --
>>> Regards
>>> Dave
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
> Cheers, Andreas
>
>
>
>
>

--
Regards
Dave

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [BUG] copy file result with zero
2011-10-02 8:43 ` Dave Young
@ 2011-10-03 11:08 ` Pádraig Brady
0 siblings, 0 replies; 15+ messages in thread
From: Pádraig Brady @ 2011-10-03 11:08 UTC (permalink / raw)
To: Dave Young
Cc: Andreas Dilger, Jeff liu, Ted Ts'o, Linux Kernel Mailing List, linux-ext4 development, Coreutils

On 10/02/2011 09:43 AM, Dave Young wrote:
> On Sun, Oct 2, 2011 at 4:02 PM, Andreas Dilger <aedilger@gmail.com> wrote:
>> On 2011-10-01, at 11:41 PM, Jeff liu wrote:
>>>> On Sat, Oct 1, 2011 at 10:39 PM, Ted Ts'o <tytso@mit.edu> wrote:
>>>>> On Sat, Oct 01, 2011 at 10:01:35PM +0800, Dave Young wrote:
>>>>>> Weird problem, when I build app from source,
>>>>>> make; make install
>>>>>> run the command, but got "cannot execute binary file"
>>>>>>
>>>>>> hexdump shows the installed binary is full of zero
>>>>>>
>>>>>> Is it related to ext4 fiemap problem described below?
>>>>>> http://lwn.net/Articles/429349/
>>>>>
>>>>> There is general agreement that /bin/cp should not have been relying
>>>>> on FIEMAP, and I believe the more recent versions of /bin/cp have
>>>>> removed that code by default pending implementation of
>>>>> SEEK_HOLE/SEEK_DATA. That being said, ext4 had a workaround to its
>>>>> FIEMAP implementation that landed in 2.6.39, and you're using
>>>>> 3.1.0-rc6.
>>>
>>> Actually, upstream cp(1) using FIEMAP only if the source file is sparse, or else, it will do normal copy, i.e, block based.
>>
>> Are there any distros that are shipping with a version of cp that depends on FIEMAP? That would dramatically increase the severity of this problem, since orders of magnitude more users will hit the problem.
>
> I'm not sure if it depends on FIEMAP, I think it should be not so old.
>
>>
>> Dave, what distro were you seeing this problem on, and had you installed/upgraded your coreutils and/or kernel yourself?
>
> Slackware 13.37, coreutils 8.11
> kernel is always built from linus's git by myself

Coreutils 8.11 was only released for 13 days,
before 8.12 was released specifically to avoid this issue.
Slackware should update.

Coreutils 8.12 only uses a fiemap based copy for sparse files,
where it will do a sync first. The sparseness heuristic is
st_blocks < st_size / st_blksize

cheers,
Pádraig.

^ permalink raw reply [flat|nested] 15+ messages in thread
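The sparseness heuristic quoted in the reply above can be sketched in a few lines. This is only an illustration of the inequality as written in the email, not the coreutils source: the function names `looks_sparse` and `file_looks_sparse` are hypothetical, and the use of integer division is an assumption, since the email does not say how the division rounds.

```python
import os


def looks_sparse(st_blocks: int, st_size: int, st_blksize: int) -> bool:
    """Evaluate `st_blocks < st_size / st_blksize` as quoted in the thread.

    Integer division is an assumption made for this sketch; the email
    does not specify the rounding.
    """
    if st_blksize <= 0:
        # No usable block size reported; treat the file as not sparse.
        return False
    return st_blocks < st_size // st_blksize


def file_looks_sparse(path: str) -> bool:
    """Hypothetical helper: apply the heuristic to a file's stat() data."""
    st = os.stat(path)  # st_blocks and st_blksize are available on Unix
    return looks_sparse(st.st_blocks, st.st_size, st.st_blksize)
```

For instance, a 1 MiB file that reports zero allocated blocks satisfies the inequality (`looks_sparse(0, 1 << 20, 4096)` is true), while a file that reports 728 blocks for 368662 bytes does not.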
end of thread, other threads:[~2011-10-03 15:52 UTC | newest]

Thread overview: 15+ messages
2011-10-01 14:01 [BUG] copy file result with zero Dave Young
2011-10-01 14:39 ` Ted Ts'o
2011-10-01 23:37 ` Dave Young
2011-10-02 6:41 ` Jeff liu
2011-10-02 7:59 ` Andreas Dilger
2011-10-02 7:14 ` Jeff liu
2011-10-02 8:46 ` Dave Young
2011-10-02 11:54 ` Christoph Hellwig
2011-10-03 9:26 ` Yongqiang Yang
2011-10-03 13:11 ` Lukas Czerner
2011-10-03 14:18 ` Ted Ts'o
2011-10-03 15:51 ` Lukas Czerner
2011-10-02 8:02 ` Andreas Dilger
2011-10-02 8:43 ` Dave Young
2011-10-03 11:08 ` Pádraig Brady