git.vger.kernel.org archive mirror
From: Neal Kreitzinger <nkreitzinger@gmail.com>
To: "René Scharfe" <rene.scharfe@lsrfire.ath.cx>
Cc: Junio C Hamano <gitster@pobox.com>, Jeff King <peff@peff.net>,
	Neal Kreitzinger <neal@rsss.com>,
	git@vger.kernel.org
Subject: Re: git-archive and tar options
Date: Mon, 18 Jul 2011 14:31:58 -0500
Message-ID: <4E248A2E.3090902@gmail.com>
In-Reply-To: <4E20AA42.7000003@lsrfire.ath.cx>

On 7/15/2011 3:59 PM, René Scharfe wrote:
> Am 15.07.2011 01:30, schrieb Junio C Hamano:
>> Jeff King <peff@peff.net> writes:
>>
>>>> Why?
>>>>
>>>> The tree you are writing out that way looks very different from
>>>> what is recorded in the commit object. What's the point of
>>>> introducing confusion by allowing many tarballs with different
>>>> contents written from the same commits with such tweaks all
>>>> labelled with the same pax header?
>>>
>>> See my later message. I think it depends on how the embedded id
>>> is used. Is it to say "this represents the tree of this git
>>> commit"? Or is it to help people who later have a tarball and
>>> have no clue which commit it might have come from?
>>
>> People who have no clue which part of the subtree was extracted and
>> what leading path was added would still have to wonder where the
>> tree came from even with the embedded id. Without your patch, if
>> the tarball has an embedded id, wouldn't they at least be able to
>> assume it is the whole thing of that commit? If you label a
>> randomly mutated tree with the same label, you cannot tell the
>> genuine one from manipulated ones.
>>
>> Not that I have strong opinions on this, either, but that is what I
>> meant by "_introducing_" confusion.
>
> When we started to write the ID into generated archives, there was
> only git-tar-tree and no <rev>:<path> syntax.  It would write the ID
> only if it was given a commit and not if it got a tree or if the
> user started it from a subdirectory.  The result was that only the
> full tree of a commit was branded with the commit ID.
>
> Now we have git archive, a more flexible command line syntax all
> around, path limiting as well as attributes that can affect the
> contents of the files in the archive.  Back then the commit ID was
> sufficient as a concise and canonical label of the archive contents,
> but now things are a bit more complicated.
>
> Which use cases are we aiming for?  Do we want to include all of the
> command line arguments (with revs resolved to SHA1-IDs)?  Only those
> that modify archive contents?  And any applied attributes?  Or do we
> want to get stricter and only write the commit ID if a full unchanged
> tree of a commit is being archived?
>
Regarding the use cases you enumerated, I think logging the command
line along with the appropriate ref context (the resolved value of
HEAD, etc.) would document exactly what's in the archive.

As for use cases in general, my impression is that git-archive is
for producing archives useful for deployment.  The target deployed
structure may vary, so expecting the source git repo to reflect it is
infeasible.  It seems like the local tar installation could perform
the necessary transformations.  I'm not sure what problems might arise
from differing tar versions on the source and target machines.

A practical problem with the pax header is that it's only useful if you
still have the archive.  Archives usually get deleted after being
extracted.  Therefore, an option to also generate (and add to the
archive) an automatic "VERSION.TXT" file of some sort that records
the context of the archive would be much more useful.  It would need its
own --prefix option because it would often be generated dynamically,
based on the git-archive request.
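
To illustrate the VERSION.TXT idea, here is a minimal Python sketch that
appends such a file to an already generated, uncompressed tar archive.
It is not something git-archive does itself; the function name
add_version_file, its parameters, and the layout of the text are all
assumptions made for the example.

```python
import io
import tarfile
import time


def add_version_file(tar_bytes, text, prefix="", name="VERSION.TXT"):
    """Return tar_bytes with one extra text member appended.

    Hypothetical helper: works on uncompressed tar data only, because
    tarfile cannot append to compressed archives.
    """
    buf = io.BytesIO(tar_bytes)
    data = text.encode("utf-8")

    # Describe the new member; the prefix is independent of any
    # prefix used for the rest of the archive.
    info = tarfile.TarInfo(prefix + name)
    info.size = len(data)
    info.mtime = int(time.time())

    # Mode "a" scans past the existing members, then appends after them.
    with tarfile.open(fileobj=buf, mode="a") as tar:
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()
```

The text passed in could be the logged command line plus the resolved
commit ID, so the context survives after the tarball itself is deleted.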

Another use case: it seems like there should also be an option
to tar only the objects changed between a specified range of commits.
However, I'm not sure whether tar can handle removals (moves, deletions,
renames) upon extraction in this context.
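
One way to script the changed-files case, sketched in Python under the
assumption that removals are carried out by a separate step on the
target rather than by tar: parse the output of
"git diff --name-status -M <old> <new>" into a list of paths to put in
the archive and a list of paths to delete.  The function name
split_changes is hypothetical, and paths that git would quote (special
characters, when -z is not used) are not handled.

```python
def split_changes(name_status):
    """Split `git diff --name-status` output into two path lists.

    Returns (to_archive, to_delete).  Each input line is tab-separated:
    a status letter (optionally followed by a similarity score) and one
    path, or two paths for renames and copies.
    """
    to_archive, to_delete = [], []
    for line in name_status.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        status = fields[0][0]
        if status == "D":
            # Deleted: nothing to archive, remove on the target.
            to_delete.append(fields[1])
        elif status == "R":
            # Renamed: remove the old path, ship the new one.
            to_delete.append(fields[1])
            to_archive.append(fields[2])
        elif status == "C":
            # Copied: the source stays, only the copy is new.
            to_archive.append(fields[2])
        elif status in "AMT":
            # Added, modified, or type changed: ship the current content.
            to_archive.append(fields[1])
        # Any other status (e.g. unmerged) is ignored in this sketch.
    return to_archive, to_delete
```

The first list could then be passed as path arguments to git archive,
while the second would have to be applied on the target by some other
means, such as a small removal script shipped next to the tarball.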

I can see that my use cases are something I can script myself, but
to do so it seems like I would be better off using a non-bare repo
checkout as an intermediary.  If that is what I am expected to do, then I
am not sure what git-archive is intended to be useful for.  Maybe
I don't understand what others use it for.

v/r,
neal
