From: fuz@fuz.su
To: git@vger.kernel.org
Subject: Re: git archive should use vendor extension in pax header
Date: Thu, 28 Jan 2016 00:25:34 +0100
Message-ID: <20160127232534.GA5435@fuz.su>
In-Reply-To: <56A7EDE1.1020909@web.de>

On Tue, Jan 26, 2016 at 11:06:25PM +0100, René Scharfe wrote:
> Am 24.01.2016 um 16:59 schrieb fuz@fuz.su:
> >Right now, git archive creates a pax global header of the form
> >
> >     comment=57ca140635bf157354124e4e4b3c8e1bde2832f1
> >
> >in tar archives it creates. This is suboptimal as comments are
> >specified to be ignored by extraction software. It is impossible to
> >find out in an automatic way (short of guessing) that this is supposed
> >to be a commit hash.
> 
> This is only a problem if you don't know how a given tar file was
> created (or modified later).  How did you get into this situation?
> Or in other words: Please tell me more about your use case.

My situation is that I'm interested in knowing whether an archive was
created by git, so that I can find the corresponding repository and the
commit the archive was created from.  Right now the only way is to open
a hex editor, as archiving software is instructed to ignore the content
of comment headers.  This is clearly a suboptimal situation.
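
To illustrate the lookup I have in mind, here is a minimal sketch in
Python.  It assumes the standard tarfile module, which exposes the pax
global header as a dictionary.  GIT.commit is the key proposed in this
thread, not something git writes today, and the helper names
find_commit_id and make_demo_archive are made up for the example.

```python
import io
import tarfile


def find_commit_id(fileobj):
    """Return (key, value) for the first commit-like pax global header, or None."""
    with tarfile.open(fileobj=fileobj, mode="r:") as tar:
        # Global pax headers read so far (the leading one is parsed on open).
        headers = tar.pax_headers
        # Prefer the proposed vendor key, fall back to the comment git writes now.
        for key in ("GIT.commit", "comment"):
            if key in headers:
                return key, headers[key]
    return None


def make_demo_archive(global_headers):
    """Build a small in-memory pax archive carrying the given global headers."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT,
                      pax_headers=global_headers) as tar:
        data = b"hello\n"
        info = tarfile.TarInfo("README")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf
```

With only the comment header, a tool cannot tell a commit hash from any
other free-form comment; with a vendor key the meaning is explicit.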

> >It would be much more useful if git created a
> >custom key. As per POSIX suggestions, something like this would be
> >appropriate:
> >
> >     GIT.commit=57ca140635bf157354124e4e4b3c8e1bde2832f1
> 
> This would be included in addition to the comment in order to avoid
> breaking existing users, I guess.

Good point.  I'm not sure how many users use the comment header at all.

> If you have a random archive and want to know if it was generated by
> git then your next question might be which options and substitutions
> were used.  That reminds me of this thread regarding verifiable
> archives:
> 
>     http://article.gmane.org/gmane.comp.version-control.git/240244

Good point.  Something like this should be enough to have
reproducible archives if archives with a tree ID were to have a time
stamp of 0 (1970-01-01) instead of the current date:

    comment=...    (for compatibility)
    GIT.commit=... (like comment)
    GIT.umask=...  (tar.umask)
    GIT.prefix=... (--prefix=)
    GIT.path=...   (see below)
    GIT.export-subst=1 (in extended header instead of global header)

A different key such as GIT.treeish might be appropriate.  The
GIT.export-subst key should be set only for those files where a
substitution has taken place. Maybe there should also be a
GIT.original-name key.
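
A rough sketch of what such an archive could look like, again using
Python's tarfile module.  All GIT.* keys are the proposal from this
mail, not existing git behaviour; the umask and prefix values and the
name write_archive are placeholders chosen for the example.  Global
keys go into the pax global header, GIT.export-subst goes into the
per-file extended header, and every member gets a time stamp of 0.

```python
import io
import tarfile

COMMIT = "57ca140635bf157354124e4e4b3c8e1bde2832f1"
PREFIX = "project-1.0/"  # placeholder for --prefix=


def write_archive(files, substituted=()):
    """Write files (name -> bytes) into a pax archive and return its bytes.

    Names listed in `substituted` get GIT.export-subst=1 in their own
    extended header; all other keys live in the global header.
    """
    global_headers = {
        "comment": COMMIT,        # kept for compatibility
        "GIT.commit": COMMIT,     # proposed vendor key
        "GIT.umask": "0002",      # placeholder for tar.umask
        "GIT.prefix": PREFIX,
    }
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT,
                      pax_headers=global_headers) as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(PREFIX + name)
            info.size = len(data)
            info.mtime = 0  # 1970-01-01, so repeated runs give the same bytes
            if name in substituted:
                info.pax_headers = {"GIT.export-subst": "1"}
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()
```

When reading such an archive back, tarfile merges the global header
into each member's pax_headers, so a reader sees GIT.commit on every
file and GIT.export-subst only on the files that carry it.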

An option GIT.export-ignore is not required.  Instead it would be more
useful to have a special file type G (for git) with the convention that
the file name .gitattributes means “attributes that apply to this git
archive.”

The GIT.path option holds the paths that are being archived. It is a bit
tricky to get right.  The intent of POSIX pax headers is that each key
is an attribute that applies to a series of files.  In the case of a
global header, each key applies until it is overridden with a new
header or with a local header.  A GIT.path key should only apply to the
files that correspond to this path operand to git archive.  Thus, a new
GIT.path should be written frequently.  There should always be at least
one GIT.path.
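
The scoping rule above can be modelled in a few lines of Python.  This
is only a simulation of how a reader would resolve the effective
GIT.path for each file, not a tar parser; the record tuples and the
name effective_paths are invented for the illustration.

```python
def effective_paths(records):
    """Resolve the GIT.path that applies to each file in a record stream.

    `records` is an iterable of ("global", dict), ("local", dict) or
    ("file", name).  A global header stays in force until overridden;
    a local header applies only to the next file.
    """
    global_state = {}
    pending_local = {}
    result = []
    for kind, payload in records:
        if kind == "global":
            global_state.update(payload)
        elif kind == "local":
            pending_local = dict(payload)
        elif kind == "file":
            merged = {**global_state, **pending_local}
            result.append((payload, merged.get("GIT.path")))
            pending_local = {}  # local headers do not carry over
        else:
            raise ValueError("unknown record kind: %r" % (kind,))
    return result
```

For example, archiving the path operands src and doc would yield one
global header with GIT.path=src before the files under src and another
with GIT.path=doc before the files under doc.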

It might be a good idea to be able to control the kind of metadata git
adds to the archive, so as not to leak any confidential information
with git archive.  If you are interested, I can try to write a
specification for these headers.

Yours sincerely,
Robert Clausecker

-- 
()  ascii ribbon campaign - for an 8-bit clean world 
/\  - against html email  - against proprietary attachments
