git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Another fast-import/import-tars issue
@ 2007-05-11 20:08 Chris Riddoch
  2007-05-16 16:22 ` [PATCH] import-tars: Use the "Link indicator" to identify directories Johannes Schindelin
  0 siblings, 1 reply; 4+ messages in thread
From: Chris Riddoch @ 2007-05-11 20:08 UTC (permalink / raw)
  To: git

Hi, folks.

I believe I've uncovered an issue in fast-import, but I don't know the
code well enough yet to debug it.  So, I'll produce my evidence and
let others work on finding the solution.  It should be pretty easy to
reproduce.

First, I'm running: 1.5.2.rc1.9.g6644

Grab the tarball of Perl 5.8.8 - http://www.perl.com/CPAN/src/perl-5.8.8.tar.bz2

Note its md5, just so you know it's not corrupted from the outset.
b8c118d4360846829beb30b02a6b91a7  perl-5.8.8.tar.gz
a377c0c67ab43fd96eeec29ce19e8382  perl-5.8.8.tar.bz2

Try this:

$ tar -xjf perl-5.8.8.tar.bz2
$ cd perl-5.8.8
$ git init
$ git add .
$ git commit -a -m "Import from working tree copy"

Now, for convenience of debugging, I have myself a script I call
~/bin/fast-import-filter.sh:

#!/bin/bash
tee fast-import.log | git fast-import --quiet

Then, I have a slightly-changed ~/bin/import-tars script, like so:

20c20,21
< open(FI, '|-', 'git', 'fast-import', '--quiet')
---
> #open(FI, '|-', 'git', 'fast-import', '--quiet')
> open(FI, '|-', 'fast-import-filter.sh')

Now,

$ import-tars.pl ../perl-5.8.8.tar.bz2

Okay, so the trees pointed to by the tips of the master and
import-tars branches *should* be identical here, right?

$ git diff-tree master: import-tars: | wc -l
229

Not so good.

-- 
epistemological humility
  Chris Riddoch

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH] import-tars: Use the "Link indicator" to identify directories
  2007-05-11 20:08 Another fast-import/import-tars issue Chris Riddoch
@ 2007-05-16 16:22 ` Johannes Schindelin
  2007-05-16 18:24   ` Junio C Hamano
  0 siblings, 1 reply; 4+ messages in thread
From: Johannes Schindelin @ 2007-05-16 16:22 UTC (permalink / raw)
  To: Chris Riddoch; +Cc: git, spearce


Earlier, we used the mode to determine if a name was associated with
a directory. This fails, since some tar programs do not set the mode
correctly. However, the link indicator _has_ to be set correctly.

Noticed by Chris Riddoch.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---

	On Fri, 11 May 2007, Chris Riddoch wrote:

	> I believe I've uncovered an issue in fast-import, but I don't 
	> know the code well enough yet to debug it.  So, I'll produce my 
	> evidence and let others work on finding the solution.  It should 
	> be pretty easy to reproduce.

	It was easy. Thanks.

	The problem is -- again -- that a directory is overwritten, since 
	it is not recognized as a directory. Earlier, I tried to use the 
	trailing "/" for that. Which fails with your example.

	I actually took the time to research in Wikipedia what should be 
	the correct way to find out if the current item is a directory...

 contrib/fast-import/import-tars.perl |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/contrib/fast-import/import-tars.perl b/contrib/fast-import/import-tars.perl
index 1e6fa5a..23aeb25 100755
--- a/contrib/fast-import/import-tars.perl
+++ b/contrib/fast-import/import-tars.perl
@@ -75,7 +75,7 @@ foreach my $tar_file (@ARGV)
 		$mode = oct $mode;
 		$size = oct $size;
 		$mtime = oct $mtime;
-		next if $mode & 0040000;
+		next if $typeflag == 5; # directory
 
 		print FI "blob\n", "mark :$next_mark\n", "data $size\n";
 		while ($size > 0 && read(I, $_, 512) == 512) {
-- 
1.5.2.rc3.2506.ge455

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] import-tars: Use the "Link indicator" to identify directories
  2007-05-16 16:22 ` [PATCH] import-tars: Use the "Link indicator" to identify directories Johannes Schindelin
@ 2007-05-16 18:24   ` Junio C Hamano
  2007-05-16 18:49     ` Junio C Hamano
  0 siblings, 1 reply; 4+ messages in thread
From: Junio C Hamano @ 2007-05-16 18:24 UTC (permalink / raw)
  To: spearce; +Cc: Chris Riddoch, git, Johannes Schindelin

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Earlier, we used the mode to determine if a name was associated with
> a directory. This fails, since some tar programs do not set the mode
> correctly. However, the link indicator _has_ to be set correctly.

> 	The problem is -- again -- that a directory is overwritten, since 
> 	it is not recognized as a directory. Earlier, I tried to use the 
> 	trailing "/" for that. Which fails with your example.
>
> 	I actually took the time to research in Wikipedia what should be 
> 	the correct way to find out if the current item is a directory...

This matches my reading of GNU tar as well.  The patch would not
break a correctly made tar archive, would fix importing archives
created by a broken tar that does not do mode right (but uses the
correct typeflag), _and_ would _break_ archives created by tar
that is broken in a different way, sets the mode right but uses
a wrong typeflag.

I do not know which breakages are more common, and this one
being in contrib/ I do not think it really matters in practice,
but as a principle, I think we should try to adhere to the same
"no regression" policy the kernel folks try to adhere to.  If
something used to work, even if its was by accident or a bug, we
had better have a pretty good reason to break it by a change
that fixes things for other people, _even_ when that other
people outnumber the people who are affected by the regression.

I'd first ask GNU tar maintainer if he knows of existing
implementations of tar that are broken in the latter sense (iow,
sets modes correctly but typeflag incorrectly), as the tarball
extraction codepath would have the exact same issue.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] import-tars: Use the "Link indicator" to identify directories
  2007-05-16 18:24   ` Junio C Hamano
@ 2007-05-16 18:49     ` Junio C Hamano
  0 siblings, 0 replies; 4+ messages in thread
From: Junio C Hamano @ 2007-05-16 18:49 UTC (permalink / raw)
  To: spearce; +Cc: Chris Riddoch, git, Johannes Schindelin

Junio C Hamano <junkio@cox.net> writes:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
>> Earlier, we used the mode to determine if a name was associated with
>> a directory. This fails, since some tar programs do not set the mode
>> correctly. However, the link indicator _has_ to be set correctly.

Nah, what was I smoking.  Even gtar seems to give mode="0000775"
(with NUL termination) for directories, so there is no way there
were regressions.  Patch looks good.

	Acked-by: Junio C Hamano <junkio@cox.net>

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2007-05-16 18:49 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-05-11 20:08 Another fast-import/import-tars issue Chris Riddoch
2007-05-16 16:22 ` [PATCH] import-tars: Use the "Link indicator" to identify directories Johannes Schindelin
2007-05-16 18:24   ` Junio C Hamano
2007-05-16 18:49     ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).