All of lore.kernel.org
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Nick Townsend <nick.townsend@mac.com>
Cc: "Git List" <git@vger.kernel.org>,
	"Eric Sunshine" <sunshine@sunshineco.com>,
	"René Scharfe" <l.s.r@web.de>,
	"Jens Lehmann" <Jens.Lehmann@web.de>,
	"Heiko Voigt" <hvoigt@hvoigt.net>
Subject: Re: [PATCH] Additional git-archive tests
Date: Tue, 10 Dec 2013 10:48:47 -0800	[thread overview]
Message-ID: <xmqq8uvsv8xc.fsf@gitster.dls.corp.google.com> (raw)
In-Reply-To: <3943DF12-8FE8-4664-8901-C243D588BE82@mac.com> (Nick Townsend's message of "Mon, 09 Dec 2013 21:26:42 -0800")

Nick Townsend <nick.townsend@mac.com> writes:

>> In other words, are all these new tests expected to pass?
>
> Sorry about that - the first four tests do pass with 1.8.5, the last
> three tests do not currently pass.

OK.

>> So I do not think it is expected to accept tree object names with
>> the HEAD:<path> style syntax, if the user expects a predictable and
>> consistent result.  The third step above attempts to make sure that
>> you name a tree-ish that corresponds to the top-level of the
>> project, i.e. with no <path>.
>> 
> That’s not quite right - paths do work from the root directory - see tests.

I know that they work from the very top, but they _only_ happens to
work from the very top.  As I do not see the code is designed to
compare the prefix (i.e. the subdirectory you are in) and the path
that is tucked after the name of the top-level tree object with a
colon (e.g. path in "HEAD:path") and adjust these two accordingly, I
do not think it was designed to deal with that at all.  The code
just assumes that the tree object corresponds to the top-level, and
the adjustment is done solely by adding the prefix when limiting the
output by paths.

This observation *is* very right. Not designing to deal with that
case may or may not be right, but that is a different issue.

> It was this very useful capability that I sought to add and generalized
> the code to do. 
>> What seems to be supported are:
>> 
>>    cd a && git archive HEAD ;# archives HEAD:a tree
>>    cd a && git archive HEAD -- b ;# archives a/b/ part of HEAD:a as b/
>> 
>> Specifically, it appears that HEAD:./b, HEAD:b etc. are not designed
>> to work, at least to me.
>> ...
>> I am not saying that these should _not_ work, but it is unclear what
>> it means to "work".  For example, what should this do?
>> 
>>    cd a && git archive HEAD:./b $pathspec
>> 
> I think we can clear this up by documenting the cases and choosing
> sensible behaviour _for git-archive_ ie. what people might expect.

I am afraid that this topic is not limited to "git-archive";
"sensible behaviour _for git-archive_" will end up being far from
sufficient.  I would not be surprised if "ls-tree" shares the same
issue, for example.  If you had "subdir/subsubdir/path" (and other
paths) tracked in your project,

    cd subdir
    git ls-tree HEAD:./subsubdir path

is likely to have the same issue as

    cd subdir
    git archive HEAD:./subsubdir path

I would think.

> Here is a suggestion:
>
> a. The pathspec is used as a selector to include things in the archive.
> it is applied after the cwd and treeish selection.

That would mean that doing this in the top-level:

        git archive HEAD:subdir/subsubdir path<TAB>

will not auto-complete the pathname.  In all other Git commands,
vanilla pathspecs (at least the ones that do not use the magic
escape hatch ":/path", etc.) are relative to your current directory,
and I do not think we want a single command to work in an
inconsistent way with respect to that.

> b. The current working directory (if present) prefixes a path in the object
> if one is present.

That is already done at a lower-level get_sha1_with_context(), I
think.

	cd subdir
	git archive HEAD:./subsubdir

will try to use tree subdir/subsubdir in the HEAD commit (but due to
the prefix logic that is not preprared to use such a tree object,
you will get "current working directory is untracked" error).

> c. The path in the object (if present) is prefixed by the cwd (if present) and
> used to select items for archiving. However the composite path so created
> *is not present* in the archive - ie. the leading components are stripped.
> (This is very useful behaviour for creating archives without leading paths)
>
> d. As currently the ―prefix option (not the prefix from setup_git_directory)
>  is prepended to all entries in the archive.
>
> These correspond to the use cases that I have for git archive - extracting all
> or part of a multiple submodule tree into a tgz file, with or without leading
> directories.
>
>> The extended SHA-1 expression HEAD:./b in the subdirectory a/ is
>> interpreted by get_sha1_with_context_1() to be the name of the tree
>> object at path "a/b" in the commit HEAD.  Further, you are giving a
>> pathspec while in a subdirectory a/ to the command.  What should
>> that pathspec be relative to?
>> 
>> In a normal Git command, the pathspec always is relative to the
>> current subdirectory, so, the way to learn about the tree object
>> a/b/c in the HEAD while in subdirectory a/ would be:
>> 
>>    cd a && git ls-tree HEAD b/c
>> 
>> But what should the command line for archive to grab HEAD:a/b/c be?
>> It feels wrong to say:
>> 
>>    cd a && git archive HEAD:./b b/c
> It’s clear to me that if you are in a subdirectory, that is an implicit prefix on the 
> ./b so you retrieve HEAD:a/b  which then archives everything in b without the
> leading a/b. The pathspec is wrong (including the b) and this archive is empty. 
>> 
>> and it also feels wrong to say
>> 
>>    cd a && git archive HEAD:./b c
>> 
> That looks fine to me given the rules suggested above.
>
> Also git-parse manages
> to return the correct thing in this case, so I’d expect this to work.
>
>> No matter what we would do, we should behave consistently with this
>> case:
>> 
>>    treeish=$(git rev-parse HEAD:a/b) &&
>>    cd a &&
>>    git archive $treeish -- $pathspec
>> 
> Well this will fail both in the existing case (‘fatal: current
> working directory is untracked’) and with the scheme proposed
> above (‘fatal: invalid object name $treeish:a/‘)

The thing is the "current working directory is untracked" is a
fallout from the current archive code not expecting to see a
tree-ish specifed in HEAD:<path> style to interact with $cwd (aka
prefix).  Go back to the description in how parse_treeish_arg()
works in the message you are responding to (or to archive.c).  It
says "if we have prefix, grab the tree object that corresponds to
the prefix in the given tree-ish".  When "subdir" is your current
directory, if you give HEAD as the tree-ish, this logic will give
you the subtree HEAD:subdir; if you give anything else that does not
correspond to the top-level of the project, it will not be able to
find such a subtree (by definition, any tree-ish that does not
correspond to the top-level cannot contain "subdir" tree object that
corresponds to "subdir" directory of the top-level; it may contain a
random directory that happens to be named "subdir", but that will be
a different directory from the "subdir" you see at the top-level).

I do not think a workable compromise exists, but one that is closest
to be workable, even though I do not think it is such a great idea,
would be to apply the add "prefix" logic _only_ when the given
tree-ish is known to be at the top-level (e.g. when it was
originally given as a commit).

      reply	other threads:[~2013-12-10 18:49 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-12-03  0:10 [PATCH] Improvements to git-archive tests and add_submodule_odb() Nick Townsend
2013-12-03  0:14 ` Nick Townsend
2013-12-03 17:52   ` Junio C Hamano
2013-12-03 18:18   ` Heiko Voigt
2013-12-03 18:39     ` Re* " Junio C Hamano
2013-12-03 20:42       ` Heiko Voigt
2013-12-03  0:16 ` Nick Townsend
2013-12-03  9:26   ` Eric Sunshine
2013-12-03 17:54     ` Junio C Hamano
2013-12-05  2:49       ` [PATCH] Additional git-archive tests Nick Townsend
2013-12-05 19:52         ` Junio C Hamano
2013-12-10  5:26           ` Nick Townsend
2013-12-10 18:48             ` Junio C Hamano [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqq8uvsv8xc.fsf@gitster.dls.corp.google.com \
    --to=gitster@pobox.com \
    --cc=Jens.Lehmann@web.de \
    --cc=git@vger.kernel.org \
    --cc=hvoigt@hvoigt.net \
    --cc=l.s.r@web.de \
    --cc=nick.townsend@mac.com \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.