* Re: Sources for 3.18-rc1 not uploaded
[not found] <20141020115943.GA27144@gmail.com>
@ 2014-10-20 15:25 ` Linus Torvalds
2014-10-20 18:28 ` Junio C Hamano
2014-10-20 22:28 ` brian m. carlson
0 siblings, 2 replies; 18+ messages in thread
From: Linus Torvalds @ 2014-10-20 15:25 UTC (permalink / raw)
To: Konstantin Ryabitsev, Junio C Hamano, brian m. carlson
Cc: infra-steering, Git Mailing List
[-- Attachment #1: Type: text/plain, Size: 1458 bytes --]
Junio, Brian,
it seems that the stability of the "git tar" output is broken.
On Mon, Oct 20, 2014 at 4:59 AM, Konstantin Ryabitsev
<konstantin@linuxfoundation.org> wrote:
>
> Looks like 3.18-rc1 upload didn't work:
>
> This is why the front page still lists 3.17 as the latest mainline. Want
> to try again?
Ok, tried again, and failed again.
> If that still doesn't work, you may have to use version 1.7 of git when
> generating the tarball and signature -- I recall Greg having a similar
> problem in the past.
Ugh, yes, that seems to be it. Current git generates different
tar-files than older releases do:
tar-1.7.9.7 tar-cur differ: byte 107, line 1
and a quick bisection shows that it is due to commit 10f343ea814f
("archive: honor tar.umask even for pax headers") in the current git
development version.
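The offset that cmp reports is consistent with the mode field of the first tar header. In the ustar layout the 100-byte name field is followed by an 8-byte octal mode field, so a change in the last permission digit lands at byte 107 (counting from 1). A minimal sketch of that arithmetic follows, assuming the first entry is a pax header with the same name in both archives and that the permission bits are written as seven octal digits. The helper names are made up for illustration and are not git's own.

```c
#include <assert.h>
#include <stdio.h>

#define MODE_OFFSET 100	/* ustar: name[100] precedes mode[8] */
#define MODE_FIELD  8	/* seven octal digits plus a terminating NUL */

/* Mode recorded for a pax header under a given umask. */
static unsigned int pax_header_mode(unsigned int umask_bits)
{
	return 0100666 & ~umask_bits;
}

/* Render the permission bits as a 7-digit, NUL-terminated octal field. */
static void format_mode_field(char field[MODE_FIELD], unsigned int mode)
{
	int n = snprintf(field, MODE_FIELD, "%07o", mode & 07777);

	assert(n == MODE_FIELD - 1);
	(void)n;
}

/*
 * 1-based byte position (the way cmp counts) at which the mode fields
 * of two headers first differ, or 0 if the fields are identical.
 */
static int first_differing_byte(unsigned int umask_a, unsigned int umask_b)
{
	char a[MODE_FIELD], b[MODE_FIELD];
	int i;

	format_mode_field(a, pax_header_mode(umask_a));
	format_mode_field(b, pax_header_mode(umask_b));
	for (i = 0; i < MODE_FIELD; i++)
		if (a[i] != b[i])
			return MODE_OFFSET + i + 1;
	return 0;
}
```

With a umask of 000 on one side and 002 on the other, the fields read "0000666" and "0000664", and first_differing_byte() points at the final digit.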
Junio, quite frankly, I don't think that that fix was a good idea. I'd
suggest having a *separate* umask for the pax headers, so that we do
not break this long-lasting stability of "git archive" output in ways
that are unfixable and not compatible. kernel.org has relied (for a
*long* time) on being able to just upload the signature of the
resulting tar-file, because both sides can generate the same tar-file
bit-for-bit.
So instead of using "tar_umask", please make it use "tar_pax_umask",
and have that default to 000. Ok?
Something like the attached patch.
Or just revert 10f343ea814f entirely.
Linus
[-- Attachment #2: 0001-Don-t-use-the-default-tar.umask-for-pax-headers.patch --]
[-- Type: text/x-patch, Size: 2392 bytes --]
From d5ca7ae0a34e31c48397f59b03ecabda7c5c40b2 Mon Sep 17 00:00:00 2001
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Mon, 20 Oct 2014 08:21:38 -0700
Subject: [PATCH] Don't use the default 'tar.umask' for pax headers
That wasn't the original behavior, and doing so breaks the fact that
tar-files are bit-for-bit compatible across git versions.
If you really want to work around broken receiving tar implementations
(dubious, we've not needed to do so before), use "[tar] paxumask" in the
git config file. Or maybe we could expose some command line flag to do
so. But don't break existing format compatibility for dubious gains.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
archive-tar.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/archive-tar.c b/archive-tar.c
index df2f4c8a6437..40139ea4ee4e 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -14,6 +14,7 @@ static char block[BLOCKSIZE];
static unsigned long offset;
static int tar_umask = 002;
+static int tar_pax_umask = 000;
static int write_tar_filter_archive(const struct archiver *ar,
struct archiver_args *args);
@@ -192,7 +193,7 @@ static int write_extended_header(struct archiver_args *args,
unsigned int mode;
memset(&header, 0, sizeof(header));
*header.typeflag = TYPEFLAG_EXT_HEADER;
- mode = 0100666 & ~tar_umask;
+ mode = 0100666 & ~tar_pax_umask;
sprintf(header.name, "%s.paxheader", sha1_to_hex(sha1));
prepare_header(args, &header, mode, size);
write_blocked(&header, sizeof(header));
@@ -300,7 +301,7 @@ static int write_global_extended_header(struct archiver_args *args)
strbuf_append_ext_header(&ext_header, "comment", sha1_to_hex(sha1), 40);
memset(&header, 0, sizeof(header));
*header.typeflag = TYPEFLAG_GLOBAL_HEADER;
- mode = 0100666 & ~tar_umask;
+ mode = 0100666 & ~tar_pax_umask;
strcpy(header.name, "pax_global_header");
prepare_header(args, &header, mode, ext_header.len);
write_blocked(&header, sizeof(header));
@@ -374,6 +375,15 @@ static int git_tar_config(const char *var, const char *value, void *cb)
return 0;
}
+ if (!strcmp(var, "tar.paxumask")) {
+ if (value && !strcmp(value, "user")) {
+ tar_pax_umask = umask(0);
+ } else {
+ tar_pax_umask = git_config_int(var, value);
+ }
+ return 0;
+ }
+
return tar_filter_config(var, value, cb);
}
--
2.1.2.330.g565301e
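One detail of the "user" branch in the patch above: umask(2) both returns the previous mask and installs the new one, so calling umask(0) to read the current value leaves the process umask at 0 unless it is put back. Below is a minimal standalone sketch of the lookup with the restore added. resolve_pax_umask is a hypothetical helper, strtol() with base 0 merely stands in for git_config_int(), and returning 0 for a missing value is an assumption of this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: turn a "tar.paxumask"-style value into umask bits. */
static int resolve_pax_umask(const char *value)
{
	if (value && !strcmp(value, "user")) {
		/* Reading the umask requires setting it; put it back afterwards. */
		mode_t current = umask(0);
		mode_t previous = umask(current);

		assert(previous == 0);	/* the temporary mask installed above */
		(void)previous;
		return (int)current;
	}
	/* Base 0 lets "002" be read as octal; a missing value means no masking. */
	return value ? (int)strtol(value, NULL, 0) : 0;
}
```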
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: Sources for 3.18-rc1 not uploaded
2014-10-20 15:25 ` Sources for 3.18-rc1 not uploaded Linus Torvalds
@ 2014-10-20 18:28 ` Junio C Hamano
2014-10-20 18:37 ` Konstantin Ryabitsev
2014-10-20 22:28 ` brian m. carlson
1 sibling, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2014-10-20 18:28 UTC (permalink / raw)
To: Linus Torvalds
Cc: Konstantin Ryabitsev, brian m. carlson, infra-steering,
Git Mailing List
Linus Torvalds <torvalds@linux-foundation.org> writes:
> Junio, Brian,
>
> it seems that the stability of the "git tar" output is broken.
>
> On Mon, Oct 20, 2014 at 4:59 AM, Konstantin Ryabitsev
> <konstantin@linuxfoundation.org> wrote:
>>
>> Looks like 3.18-rc1 upload didn't work:
>>
>> This is why the front page still lists 3.17 as the latest mainline. Want
>> to try again?
>
> Ok, tried again, and failed again.
>
>> If that still doesn't work, you may have to use version 1.7 of git when
>> generating the tarball and signature -- I recall Greg having a similar
>> problem in the past.
>
> Ugh, yes, that seems to be it. Current git generates different
> tar-files than older releases do:
>
> tar-1.7.9.7 tar-cur differ: byte 107, line 1
>
> and a quick bisection shows that it is due to commit 10f343ea814f
> ("archive: honor tar.umask even for pax headers") in the current git
> development version.
>
> Junio, quite frankly, I don't think that that fix was a good idea. I'd
> suggest having a *separate* umask for the pax headers, so that we do
> not break this long-lasting stability of "git archive" output in ways
> that are unfixable and not compatible. kernel.org has relied (for a
> *long* time) on being able to just upload the signature of the
> resulting tar-file, because both sides can generate the same tar-file
> bit-for-bit.
>
> So instead of using "tar_umask", please make it use "tar_pax_umask",
> and have that default to 000. Ok?
>
> Something like the attached patch.
>
> Or just revert 10f343ea814f entirely.
My preference for this particular one however is to simply revert
it. I do not see much point in bending over backwards to treat older
implementations of tar that do not understand extended pax headers
very specially by adding a separate option or configuration, even
though I wouldn't have minded if the original implementation were to
apply the same umask for these entries that look like "dummy files"
to them.
I have to wonder why 10f343ea (archive: honor tar.umask even for pax
headers, 2014-08-03) is a problem but an earlier change v1.8.1.1~8^2
(archive-tar: split long paths more carefully, 2013-01-05), which
also should have broken bit-for-bit compatibility, went unnoticed,
though. What I am getting at is that correcting past mistakes in
the output should not be forbidden unconditionally with a complaint
like this.
If 10f343ea were an important fix, then my preference would have
been to instead add "tar_ignore_umask_in_pax_header" to allow those
who care more about bit-for-bit compatibility with older broken
versions than correctness to conditionally disable its code. But I
do not think it is, so my preference isn't.
Thanks.
* Re: Sources for 3.18-rc1 not uploaded
2014-10-20 18:28 ` Junio C Hamano
@ 2014-10-20 18:37 ` Konstantin Ryabitsev
2014-10-20 19:43 ` Junio C Hamano
2014-10-20 21:52 ` Greg KH
0 siblings, 2 replies; 18+ messages in thread
From: Konstantin Ryabitsev @ 2014-10-20 18:37 UTC (permalink / raw)
To: Junio C Hamano, Linus Torvalds
Cc: brian m. carlson, infra-steering, Git Mailing List
On 20/10/14 02:28 PM, Junio C Hamano wrote:
> I have to wonder why 10f343ea (archive: honor tar.umask even for pax
> headers, 2014-08-03) is a problem but an earlier change v1.8.1.1~8^2
> (archive-tar: split long paths more carefully, 2013-01-05), which
> also should have broken bit-for-bit compatibility, went unnoticed,
> though. What I am getting at is that correcting past mistakes in
> the output should not be forbidden unconditionally with a complaint
> like this.
I think Greg actually ran into that one, and uses a separate 1.7 git
tree for this reason.
I can update our servers to git 2.1 (which most of them already have),
which should help with previous incompatibilities -- but not the future
ones obviously. :)
-K
* Re: Sources for 3.18-rc1 not uploaded
2014-10-20 18:37 ` Konstantin Ryabitsev
@ 2014-10-20 19:43 ` Junio C Hamano
2014-10-20 21:52 ` Greg KH
1 sibling, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2014-10-20 19:43 UTC (permalink / raw)
To: Konstantin Ryabitsev
Cc: Linus Torvalds, brian m. carlson, infra-steering,
Git Mailing List
Konstantin Ryabitsev <konstantin@linuxfoundation.org> writes:
> On 20/10/14 02:28 PM, Junio C Hamano wrote:
>> I have to wonder why 10f343ea (archive: honor tar.umask even for pax
>> headers, 2014-08-03) is a problem but an earlier change v1.8.1.1~8^2
>> (archive-tar: split long paths more carefully, 2013-01-05), which
>> also should have broken bit-for-bit compatibility, went unnoticed,
>> though. What I am getting at is that correcting past mistakes in
>> the output should not be forbidden unconditionally with a complaint
>> like this.
>
> I think Greg actually ran into that one, and uses a separate 1.7 git
> tree for this reason.
>
> I can update our servers to git 2.1 (which most of them already have),
> which should help with previous incompatibilities -- but not the future
> ones obviously. :)
Updating to 2.1 will hopefully correct the change in v1.8.1.1~8^2,
and will break Greg and friends who stick to 1.7 for that reason,
though.
The "breakage" in 10f343ea was only in the 'master' branch and
upwards, which is not yet released in any tagged version, and I just
reverted it from my tree, so people on the cutting edge will be okay
in short order.
Thanks.
* Re: Sources for 3.18-rc1 not uploaded
2014-10-20 18:37 ` Konstantin Ryabitsev
2014-10-20 19:43 ` Junio C Hamano
@ 2014-10-20 21:52 ` Greg KH
1 sibling, 0 replies; 18+ messages in thread
From: Greg KH @ 2014-10-20 21:52 UTC (permalink / raw)
To: Konstantin Ryabitsev
Cc: Junio C Hamano, Linus Torvalds, brian m. carlson, infra-steering,
Git Mailing List
On Mon, Oct 20, 2014 at 02:37:09PM -0400, Konstantin Ryabitsev wrote:
> On 20/10/14 02:28 PM, Junio C Hamano wrote:
> > I have to wonder why 10f343ea (archive: honor tar.umask even for pax
> > headers, 2014-08-03) is a problem but an earlier change v1.8.1.1~8^2
> > (archive-tar: split long paths more carefully, 2013-01-05), which
> > also should have broken bit-for-bit compatibility, went unnoticed,
> > though. What I am getting at is that correcting past mistakes in
> > the output should not be forbidden unconditionally with a complaint
> > like this.
>
> I think Greg actually ran into that one, and uses a separate 1.7 git
> tree for this reason.
I used to have to do this for the 3.0-stable kernel as one of the files
in it ran into the "very long path" problem. I just ran the latest
version of git with that one commit reverted and all was fine.
After 3.0 was done, I just dropped that patch from my local version and
have been running with the latest version of git with no problems.
> I can update our servers to git 2.1 (which most of them already have),
> which should help with previous incompatibilities -- but not the future
> ones obviously. :)
I thought you already did this. Or was that only the public facing git
servers?
thanks,
greg k-h
* Re: Sources for 3.18-rc1 not uploaded
2014-10-20 15:25 ` Sources for 3.18-rc1 not uploaded Linus Torvalds
2014-10-20 18:28 ` Junio C Hamano
@ 2014-10-20 22:28 ` brian m. carlson
2014-10-20 23:17 ` Linus Torvalds
2014-10-20 23:44 ` Konstantin Ryabitsev
1 sibling, 2 replies; 18+ messages in thread
From: brian m. carlson @ 2014-10-20 22:28 UTC (permalink / raw)
To: Linus Torvalds
Cc: Konstantin Ryabitsev, Junio C Hamano, infra-steering,
Git Mailing List
[-- Attachment #1: Type: text/plain, Size: 1561 bytes --]
On Mon, Oct 20, 2014 at 08:25:59AM -0700, Linus Torvalds wrote:
> Junio, Brian,
>
> it seems that the stability of the "git tar" output is broken.
It doesn't appear that the stability of git archive --format=tar is
documented anywhere. Given that, it doesn't seem reasonable to expect
that any tar implementation produces bit-for-bit compatible output
between versions. After all, look at all the contortions that Debian
has had to go through to keep pristine-tar working.
> Junio, quite frankly, I don't think that that fix was a good idea. I'd
> suggest having a *separate* umask for the pax headers, so that we do
> not break this long-lasting stability of "git archive" output in ways
> that are unfixable and not compatible. kernel.org has relied (for a
> *long* time) on being able to just upload the signature of the
> resulting tar-file, because both sides can generate the same tar-file
> bit-for-bit.
It sounds like kernel.org has a bug, then. Perhaps that's the
appropriate place to fix the issue.
The issue I fixed is that leaving world-writable files around on disk is
a great way for people to cause mischief (for example, by filling up
other users' quotas), and some tar implementations and all Linux pax
implementations extract the pax headers into the working directory, and
that's often /tmp.
--
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | http://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: RSA v4 4096b: 88AC E9B2 9196 305B A994 7552 F1BA 225C 0223 B187
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
* Re: Sources for 3.18-rc1 not uploaded
2014-10-20 22:28 ` brian m. carlson
@ 2014-10-20 23:17 ` Linus Torvalds
2014-10-21 8:08 ` Michael J Gruber
2014-10-20 23:44 ` Konstantin Ryabitsev
1 sibling, 1 reply; 18+ messages in thread
From: Linus Torvalds @ 2014-10-20 23:17 UTC (permalink / raw)
To: Linus Torvalds, Konstantin Ryabitsev, Junio C Hamano,
infra-steering, Git Mailing List
On Mon, Oct 20, 2014 at 3:28 PM, brian m. carlson
<sandals@crustytoothpaste.net> wrote:
>
> It doesn't appear that the stability of git archive --format=tar is
> documented anywhere. Given that, it doesn't seem reasonable to expect
> that any tar implementation produces bit-for-bit compatible output
> between versions.
The kernel has simple stability rules: if it breaks users, it gets
fixed or reverted. That is a damn good rule.
I realize that some other projects are crap, and don't care about
their users. I hope and believe that git is not in that sad group.
The whole "it's not documented" excuse is pure and utter bollocks.
Users don't care. And stability of data should be *expected*, not need
some random documentation entry to make it explicit.
Linus
* Re: Sources for 3.18-rc1 not uploaded
2014-10-20 22:28 ` brian m. carlson
2014-10-20 23:17 ` Linus Torvalds
@ 2014-10-20 23:44 ` Konstantin Ryabitsev
2014-10-21 18:59 ` Junio C Hamano
1 sibling, 1 reply; 18+ messages in thread
From: Konstantin Ryabitsev @ 2014-10-20 23:44 UTC (permalink / raw)
To: Linus Torvalds, Junio C Hamano, infra-steering, Git Mailing List
[-- Attachment #1: Type: text/plain, Size: 1522 bytes --]
On 20/10/14 06:28 PM, brian m. carlson wrote:
>> Junio, quite frankly, I don't think that that fix was a good idea. I'd
>> suggest having a *separate* umask for the pax headers, so that we do
>> not break this long-lasting stability of "git archive" output in ways
>> that are unfixable and not compatible. kernel.org has relied (for a
>> *long* time) on being able to just upload the signature of the
>> resulting tar-file, because both sides can generate the same tar-file
>> bit-for-bit.
> It sounds like kernel.org has a bug, then. Perhaps that's the
> appropriate place to fix the issue.
It's not a bug, it's a feature (TM). KUP relies on git-archive's ability
to create identical tar archives across platforms and versions. The
benefit is that Linus or Greg can create a detached PGP signature
against a tarball created from "git archive [tag]" on their system, and
just tell kup to create the same archive remotely, thus saving them the
trouble of uploading 80Mb each time they cut a release.
With their frequent travel to places where upload bandwidth is both slow
and unreliable, this ability to not have to upload hundreds of Mbs each
time they cut a release is very handy and certainly helps keep kernel
releases on schedule.
So, while it's fair to point out that git-archive was never intended to
always create bit-for-bit identical outputs, it would be *very nice* if
this remained in place, as at least one large-ish deployment (us) finds
it really handy.
-K
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 538 bytes --]
* Re: Sources for 3.18-rc1 not uploaded
2014-10-20 23:17 ` Linus Torvalds
@ 2014-10-21 8:08 ` Michael J Gruber
2014-10-21 16:25 ` Linus Torvalds
2014-10-21 18:14 ` Junio C Hamano
0 siblings, 2 replies; 18+ messages in thread
From: Michael J Gruber @ 2014-10-21 8:08 UTC (permalink / raw)
To: Linus Torvalds, Konstantin Ryabitsev, Junio C Hamano,
infra-steering, Git Mailing List
Linus Torvalds schrieb am 21.10.2014 um 01:17:
> On Mon, Oct 20, 2014 at 3:28 PM, brian m. carlson
> <sandals@crustytoothpaste.net> wrote:
>>
>> It doesn't appear that the stability of git archive --format=tar is
>> documented anywhere. Given that, it doesn't seem reasonable to expect
>> that any tar implementation produces bit-for-bit compatible output
>> between versions.
>
> The kernel has simple stability rules: if it breaks users, it gets
> fixed or reverted. That is a damn good rule.
>
> I realize that some other projects are crap, and don't care about
> their users. I hope and believe that git is not in that sad group.
>
> The whole "it's not documented" excuse is pure and utter bollocks.
> Users don't care. And stability of data should be *expected*, not need
> some random documentation entry to make it explicit.
>
> Linus
>
Linus, with all due respect, this is not the LKML, so please watch your
tone over here on the git list (and keep ranting on LKML however you want).
Brian made a very valid point about what his patch was trying to fix -
after all that is why it was applied. Konstantin made a very valid point
about why the existing behavior is useful for KUP. Interestingly, both
cared about the users of git, just different kinds of users.
Git is probably one of the most conservative projects regarding
backwards compatibility and heeding users' expectations (sometimes to my
own dismay). That being said, we distinguish between justified
expectations and those without a solid base - which is why we have
porcelain vs. plumbing, for example, to make clear which part of the ui
is stable. (Yeah, I know you know, but you didn't argue as if you did.)
"data" in git is stable. "data exports" by git are as stable as the
output format is intrinsically or due to the (hopefully documented) way
git produces it.
Unfortunately, the git archive doc clearly says that the umask is
applied to all archive entries. And that clearly wasn't the case (for
extended metadata headers) before Brian's fix.
Brian: How old is the newest tar that gets the extended metadata
headers wrong? If those tars are a "real concern" then we should
probably do the extra pax_umask as suggested by Linus, but have the
default protect the "unknowing users" and give the "knowing users" that
config knob to twitch (sorry, Linus). Otherwise a revert is in order.
Michael
* Re: Sources for 3.18-rc1 not uploaded
2014-10-21 8:08 ` Michael J Gruber
@ 2014-10-21 16:25 ` Linus Torvalds
2014-10-21 17:25 ` David Kastrup
2014-10-21 18:14 ` Junio C Hamano
1 sibling, 1 reply; 18+ messages in thread
From: Linus Torvalds @ 2014-10-21 16:25 UTC (permalink / raw)
To: Michael J Gruber
Cc: Konstantin Ryabitsev, Junio C Hamano, infra-steering,
Git Mailing List
On Tue, Oct 21, 2014 at 1:08 AM, Michael J Gruber
<git@drmicha.warpmail.net> wrote:
>
> Unfortunately, the git archive doc clearly says that the umask is
> applied to all archive entries. And that clearly wasn't the case (for
> extended metadata headers) before Brian's fix.
Hey, it's time for another round of the world-famous "Captain Obvious
Quiz Game"! Yay!
The questions this week are:
(1) "If reality and documentation do not match, where is the bug?"
(a) Documentation is buggy
(b) Reality is buggy
(2) "Where would you put the horse in relationship to a horse-drawn carriage?"
(a) in front
(b) in the carriage
Now, if you answered (a) to both these questions, and had this been a
real quiz show, you might have been a winner and the happy new owner
of a remote-controlled four-slice toaster with a fancy digital timer.
Sadly, this was just a dry-run for the real thing, to give people a
quick taste of the world-famous "Captain Obvious Quiz Game". I hope
you tune in next week for our exciting all-new questions.
Linus
* Re: Sources for 3.18-rc1 not uploaded
2014-10-21 16:25 ` Linus Torvalds
@ 2014-10-21 17:25 ` David Kastrup
0 siblings, 0 replies; 18+ messages in thread
From: David Kastrup @ 2014-10-21 17:25 UTC (permalink / raw)
To: Linus Torvalds
Cc: Michael J Gruber, Konstantin Ryabitsev, Junio C Hamano,
infra-steering, Git Mailing List
Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Tue, Oct 21, 2014 at 1:08 AM, Michael J Gruber
> <git@drmicha.warpmail.net> wrote:
>>
>> Unfortunately, the git archive doc clearly says that the umask is
>> applied to all archive entries. And that clearly wasn't the case (for
>> extended metadata headers) before Brian's fix.
>
> Hey, it's time for another round of the world-famous "Captain Obvious
> Quiz Game"! Yay!
>
> The questions this week are:
>
> (1) "If reality and documentation do not match, where is the bug?"
> (a) Documentation is buggy
> (b) Reality is buggy
>
> (2) "Where would you put the horse in relationship to a horse-drawn carriage?"
> (a) in front
> (b) in the carriage
You are aware that a buggy _is_ a horse-drawn carriage?
--
Captain Facepalm
* Re: Sources for 3.18-rc1 not uploaded
2014-10-21 8:08 ` Michael J Gruber
2014-10-21 16:25 ` Linus Torvalds
@ 2014-10-21 18:14 ` Junio C Hamano
2014-10-22 9:42 ` Michael J Gruber
1 sibling, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2014-10-21 18:14 UTC (permalink / raw)
To: Michael J Gruber
Cc: Linus Torvalds, Konstantin Ryabitsev, infra-steering,
Git Mailing List
Michael J Gruber <git@drmicha.warpmail.net> writes:
> Unfortunately, the git archive doc clearly says that the umask is
> applied to all archive entries.
Is an extended pax header "an archive entry"? I doubt it, and the
above is not relevant. The mode bits for the archive entry that it
applies to do not come from there.
See my other message for my final judgement on this one. I wouldn't
have minded if the original used the same umask for those ignored
mode bits, but changing the bits to be ignored after the fact is not
helping any real use case and only hurts existing users.
That is not to say that we cannot later fix bigger issues in the
output. I just do not see the otherwise-unused mode bits in the
extended pax header as a big enough issue to spend brain cycles to
carefully lay and execute transition plans to avoid breaking
existing users.
* Re: Sources for 3.18-rc1 not uploaded
2014-10-20 23:44 ` Konstantin Ryabitsev
@ 2014-10-21 18:59 ` Junio C Hamano
0 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2014-10-21 18:59 UTC (permalink / raw)
To: Konstantin Ryabitsev; +Cc: Linus Torvalds, infra-steering, Git Mailing List
Konstantin Ryabitsev <konstantin@linuxfoundation.org> writes:
> On 20/10/14 06:28 PM, brian m. carlson wrote:
>>> Junio, quite frankly, I don't think that that fix was a good idea. I'd
>>> suggest having a *separate* umask for the pax headers, so that we do
>>> not break this long-lasting stability of "git archive" output in ways
>>> that are unfixable and not compatible. kernel.org has relied (for a
>>> *long* time) on being able to just upload the signature of the
>>> resulting tar-file, because both sides can generate the same tar-file
>>> bit-for-bit.
>> It sounds like kernel.org has a bug, then. Perhaps that's the
>> appropriate place to fix the issue.
>
> It's not a bug, it's a feature (TM). KUP relies on git-archive's ability
> to create identical tar archives across platforms and versions. The
> benefit is that Linus or Greg can create a detached PGP signature
> against a tarball created from "git archive [tag]" on their system, and
> just tell kup to create the same archive remotely, thus saving them the
> trouble of uploading 80Mb each time they cut a release.
>
> With their frequent travel to places where upload bandwidth is both slow
> and unreliable, this ability to not have to upload hundreds of Mbs each
> time they cut a release is very handy and certainly helps keep kernel
> releases on schedule.
>
> So, while it's fair to point out that git-archive was never intended to
> always create bit-for-bit identical outputs, it would be *very nice* if
> this remained in place, as at least one large-ish deployment (us) finds
> it really handy.
While I agree that it is a nice "feature", I wish KUP folks thought
more about what should happen when the archive output _must_ change
when a more serious bug is discovered, and coordinated with us
better.
During a period where older and buggy versions of "git archive" are
used by some uploaders while a new version is used by others, KUP
could:
- avail itself of a version (or versions) of "git archive" so that
it can recreate both older and newer output;
- upon receiving a tarball and signature, try recreating newer
output and see if signature matches, and when the signature does
not match, recreate older output and try again.
And we could supply "git archive --compatible=v1.7" option in the
newer version if that is easier on KUP folks than having to keep
multiple installations of versions of "git archive" around.
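The retry scheme described above (try the newer output first, then fall back to the older one) could look roughly like the following. This is only a sketch: pick_matching_version, verify_fn and the toy verifier are hypothetical names. The callback stands in for "regenerate the tarball with this version of git archive and check the detached signature against it", which is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical callback: regenerate the archive with the given
 * "git archive" version and report whether the signature matches.
 */
typedef int (*verify_fn)(const char *version, const void *ctx);

/*
 * Walk the candidate versions in order (newest first) and return the
 * first one whose output verifies, or NULL if none of them does.
 */
static const char *pick_matching_version(const char *const *versions,
					 size_t count, verify_fn verify,
					 const void *ctx)
{
	size_t i;

	assert(versions != NULL && verify != NULL);
	for (i = 0; i < count; i++)
		if (verify(versions[i], ctx))
			return versions[i];
	return NULL;
}

/*
 * Toy verifier for demonstration only: "verifies" when the version
 * name equals the string passed as ctx.
 */
static int toy_verify(const char *version, const void *ctx)
{
	return strcmp(version, (const char *)ctx) == 0;
}
```

A NULL result would mean the upload has to be rejected, as it is today when the signature does not match.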
While I am on the topic of KUP, one feature I wish, which is the
only thing that is preventing me from updating the preformatted
documentation https://www.kernel.org/pub/software/scm/git/docs/, is
to allow me to upload a single tarball and extract it at one
location (e.g. /pub/software/scm/git/docs/) while removing existing
files in that location (i.e. removing deleted files). Where do I
file such a feature request?
* Re: Sources for 3.18-rc1 not uploaded
2014-10-21 18:14 ` Junio C Hamano
@ 2014-10-22 9:42 ` Michael J Gruber
2014-10-23 1:09 ` brian m. carlson
0 siblings, 1 reply; 18+ messages in thread
From: Michael J Gruber @ 2014-10-22 9:42 UTC (permalink / raw)
To: Junio C Hamano
Cc: Linus Torvalds, Konstantin Ryabitsev, infra-steering,
Git Mailing List
Junio C Hamano schrieb am 21.10.2014 um 20:14:
> Michael J Gruber <git@drmicha.warpmail.net> writes:
>
>> Unfortunately, the git archive doc clearly says that the umask is
>> applied to all archive entries.
>
> Is an extended pax header "an archive entry"? I doubt it, and the
> above is not relevant. The mode bits for the archive entry that it
> applies to do not come from there.
The problem seems to be old tar versions which mistake the extensions
for archive entries, isn't it?
> See my other message for my final judgement on this one. I wouldn't
> have minded if the original used the same umask for those ignored
> mode bits, but changing the bits to be ignored after the fact is not
> helping any real use case and only hurts existing users.
>
> That is not to say that we cannot later fix bigger issues in the
> output. I just do not see the otherwise-unused mode bits in the
> extended pax header as a big enough issue to spend brain cycles to
> carefully lay and execute transition plans to avoid breaking
> existing users.
My question to Brian still stands which existing users he was trying to
cater for with his patch. If there indeed are no existing affected users
besides the KUP users (as you seem to assume) it's a clear case. Pun
intended ;)
As I pointed out (and you cut out), I don't mind doing the revert. I
just want us to do the right things for the right reasons (the ones you
pointed out, Junio).
Michael
* Re: Sources for 3.18-rc1 not uploaded
2014-10-22 9:42 ` Michael J Gruber
@ 2014-10-23 1:09 ` brian m. carlson
2014-10-26 18:59 ` René Scharfe
0 siblings, 1 reply; 18+ messages in thread
From: brian m. carlson @ 2014-10-23 1:09 UTC (permalink / raw)
To: Michael J Gruber
Cc: Junio C Hamano, Konstantin Ryabitsev, infra-steering,
Git Mailing List
[-- Attachment #1: Type: text/plain, Size: 1948 bytes --]
On Wed, Oct 22, 2014 at 11:42:48AM +0200, Michael J Gruber wrote:
> Junio C Hamano schrieb am 21.10.2014 um 20:14:
> > Michael J Gruber <git@drmicha.warpmail.net> writes:
> >
> >> Unfortunately, the git archive doc clearly says that the umask is
> >> applied to all archive entries.
> >
> > Is an extended pax header "an archive entry"? I doubt it, and the
> > above is not relevant. The mode bits for the archive entry that it
> > applies to do not come from there.
>
> The problem seems to be old tar versions which mistake the extensions
> for archive entries, isn't it?
Yes. POSIX isn't clear on how unknown entries are to be handled. I've
seen some Windows tar implementations extract GNU longlink extensions as
files, which leads to a lot of pain.
> My question to Brian still stands which existing users he was trying to
> cater for with his patch. If there indeed are no existing affected users
> besides the KUP users (as you seem to assume) it's a clear case. Pun
> intended ;)
The pax format is an extension of the tar format. All of the pax
implementations I've seen on Linux (OpenBSD's and MirBSD's) don't
actually understand the pax headers and emit them as files. 7zip does
as well. I expect there are other Unix systems where tar itself doesn't
understand pax headers, although I don't have access to anything other
than Linux and FreeBSD.
Since it's very common to extract tar archives in /tmp, I didn't want to
leave world-writable files in /tmp (or anywhere else someone might get
to them). While the contents probably aren't sensitive, a malicious
user might fill someone's quota by "helpfully" appending /dev/zero to
the file. And yes, users do these things.
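The bit in question is the "other write" permission. A small sketch, assuming the header mode is computed the way archive-tar.c does it (0100666 with the umask bits cleared). The function names are illustrative only.

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Mode stored in a pax header entry for a given umask. */
static mode_t header_mode_with_umask(mode_t umask_bits)
{
	/* Only permission bits make sense in a umask. */
	assert((umask_bits & ~(mode_t)0777) == 0);
	return (mode_t)0100666 & ~umask_bits;
}

/* Would a file created with exactly this mode be writable by anyone? */
static int is_world_writable(mode_t mode)
{
	return (mode & S_IWOTH) != 0;
}
```

With a umask of 000 the header mode keeps S_IWOTH set; with 002 it is cleared.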
--
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | http://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: RSA v4 4096b: 88AC E9B2 9196 305B A994 7552 F1BA 225C 0223 B187
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
* Re: Sources for 3.18-rc1 not uploaded
2014-10-23 1:09 ` brian m. carlson
@ 2014-10-26 18:59 ` René Scharfe
2014-10-26 21:15 ` brian m. carlson
2014-10-27 20:19 ` Junio C Hamano
0 siblings, 2 replies; 18+ messages in thread
From: René Scharfe @ 2014-10-26 18:59 UTC (permalink / raw)
To: Michael J Gruber, Junio C Hamano, Konstantin Ryabitsev,
infra-steering, Git Mailing List
Am 23.10.2014 um 03:09 schrieb brian m. carlson:
> On Wed, Oct 22, 2014 at 11:42:48AM +0200, Michael J Gruber wrote:
>> Junio C Hamano schrieb am 21.10.2014 um 20:14:
>>> Michael J Gruber <git@drmicha.warpmail.net> writes:
>>>
>>>> Unfortunately, the git archive doc clearly says that the umask is
>>>> applied to all archive entries.
>>>
>>> Is an extended pax header "an archive entry"? I doubt it, and the
>>> above is not relevant. The mode bits for the archive entry that it
>>> applies to do not come from there.
>>
>> The problem seems to be old tar versions which mistake the extensions
>> for archive entries, doesn't it?
>
> Yes. POSIX isn't clear on how unknown entries are to be handled. I've
> seen some Windows tar implementations extract GNU longlink extensions as
> files, which leads to a lot of pain.
That's by design -- extended headers are meant to be extracted as plain
files by implementations that do not understand them.
http://pubs.opengroup.org/onlinepubs/009695399/utilities/pax.html says:
"If a particular implementation does not recognize the type, or the user
does not have appropriate privilege to create that type, the file shall
be extracted as if it were a regular file if the file type is defined to
have a meaning for the size field that could cause data logical records
to be written on the medium [...]."
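To make the quoted rule concrete, here is a minimal Python sketch (not from this thread) that builds a small pax-format archive with the standard tarfile module and walks its raw 512-byte header blocks. The helper names and the 120-character file name are assumptions chosen for illustration. A name longer than the 100-byte name field forces an extended header, which appears as a separate block with typeflag 'x' and a mode field of its own; an extractor that does not recognize that type is what ends up writing it out as a regular file.

```python
import io
import tarfile

BLOCK = 512  # tar archives are a sequence of 512-byte blocks


def build_pax_archive() -> bytes:
    """Create an in-memory pax archive holding one file with a long name."""
    buf = io.BytesIO()
    data = b"hello\n"
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
        # 120 characters do not fit the 100-byte name field of the header.
        info = tarfile.TarInfo(name="a" * 120)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def list_typeflags(archive: bytes) -> list:
    """Return (typeflag, mode) for every header block in the archive."""
    entries = []
    offset = 0
    while offset + BLOCK <= len(archive):
        header = archive[offset:offset + BLOCK]
        if header == b"\0" * BLOCK:  # an all-zero block ends the archive
            break
        typeflag = header[156:157]
        mode = int(header[100:108].rstrip(b"\0 "), 8)
        size = int(header[124:136].rstrip(b"\0 "), 8)
        entries.append((typeflag, mode))
        # Skip the header plus its payload, which is padded to a full block.
        offset += BLOCK + (size + BLOCK - 1) // BLOCK * BLOCK
    return entries


for typeflag, mode in list_typeflags(build_pax_archive()):
    print(typeflag, oct(mode))
```

Python's writer picks its own mode for the 'x' block, so the value printed here says nothing about what "git archive" writes; the point is only that the extended header is a block with a mode field of its own.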
>> My question to Brian still stands which existing users he was trying to
>> cater for with his patch. If there indeed are no existing affected users
>> besides the KUP users (as you seem to assume) it's a clear case. Pun
>> intended ;)
>
> The pax format is an extension of the tar format. All of the pax
> implementations I've seen on Linux (OpenBSD's and MirBSD's) don't
> actually understand the pax headers and emit them as files. 7zip does
> as well. I expect there are other Unix systems where tar itself doesn't
> understand pax headers, although I don't have access to anything other
> than Linux and FreeBSD.
NetBSD's tar does as well.
It's surprising and sad to see *pax* implementations not supporting pax
extended headers in 2014, though. It seems long file names etc. are not
common enough. Or perhaps pax is simply not used that much.
> Since it's very common to extract tar archives in /tmp, I didn't want to
> leave world-writable files in /tmp (or anywhere else someone might get
> to them). While the contents probably aren't sensitive, a malicious
> user might fill someone's quota by "helpfully" appending /dev/zero to
> the file. And yes, users do these things.
The extracted files are only world-writable if umask & 2 == 0 or if -p
(preserve permissions) has been used, no?
René
* Re: Sources for 3.18-rc1 not uploaded
2014-10-26 18:59 ` René Scharfe
@ 2014-10-26 21:15 ` brian m. carlson
2014-10-27 20:19 ` Junio C Hamano
1 sibling, 0 replies; 18+ messages in thread
From: brian m. carlson @ 2014-10-26 21:15 UTC (permalink / raw)
To: Git Mailing List
[-- Attachment #1: Type: text/plain, Size: 1655 bytes --]
On Sun, Oct 26, 2014 at 07:59:55PM +0100, René Scharfe wrote:
> Am 23.10.2014 um 03:09 schrieb brian m. carlson:
> >The pax format is an extension of the tar format. All of the pax
> >implementations I've seen on Linux (OpenBSD's and MirBSD's) don't
> >actually understand the pax headers and emit them as files. 7zip does
> >as well. I expect there are other Unix systems where tar itself doesn't
> >understand pax headers, although I don't have access to anything other
> >than Linux and FreeBSD.
>
> NetBSD's tar does as well.
>
> It's surprising and sad to see *pax* implementations not supporting pax
> extended headers in 2014, though. It seems long file names etc. are not
> common enough. Or perhaps pax is simply not used that much.
The original pax utility didn't specify the pax format, only cpio and
ustar. The pax format was first released in POSIX 1003.1-2001.
> >Since it's very common to extract tar archives in /tmp, I didn't want to
> >leave world-writable files in /tmp (or anywhere else someone might get
> >to them). While the contents probably aren't sensitive, a malicious
> >user might fill someone's quota by "helpfully" appending /dev/zero to
> >the file. And yes, users do these things.
>
> The extracted files are only world-writable if umask & 2 == 0 or if -p
> (preserve permissions) has been used, no?
Yes, unless you're the superuser, in which case that's the default.
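As a rough illustration of the exchange above, the following Python sketch models how an extractor derives the final permission bits. It is a simplified model under stated assumptions, not the behaviour of any particular tar: the function names are hypothetical, and special bits such as setuid are ignored.

```python
def extracted_mode(archived_mode: int, process_umask: int,
                   preserve: bool = False) -> int:
    """Permission bits a file ends up with after extraction.

    With preserve=True (the effect of -p, assumed here to be what a
    superuser gets by default) the archived bits are kept as they are;
    otherwise the extracting process's umask clears bits.
    """
    perms = archived_mode & 0o777
    if preserve:
        return perms
    return perms & ~process_umask


def world_writable(mode: int) -> bool:
    """True if the 'other' write bit (value 2) is set."""
    return bool(mode & 0o002)


# An entry archived as 0666, extracted under three different conditions.
print(oct(extracted_mode(0o666, 0o022)))                 # typical user umask
print(oct(extracted_mode(0o666, 0o000)))                 # umask & 2 == 0
print(oct(extracted_mode(0o666, 0o022, preserve=True)))  # -p given
```

Only the second and third cases leave the world-write bit set, which matches the two conditions named in the question.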
--
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | http://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: RSA v4 4096b: 88AC E9B2 9196 305B A994 7552 F1BA 225C 0223 B187
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
* Re: Sources for 3.18-rc1 not uploaded
2014-10-26 18:59 ` René Scharfe
2014-10-26 21:15 ` brian m. carlson
@ 2014-10-27 20:19 ` Junio C Hamano
1 sibling, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2014-10-27 20:19 UTC (permalink / raw)
To: René Scharfe
Cc: brian m. carlson, Michael J Gruber, Konstantin Ryabitsev,
infra-steering, Git Mailing List
René Scharfe <l.s.r@web.de> writes:
> That's by design -- extended headers are meant to be extracted as
> plain files by implementations that do not understand them.
>
> http://pubs.opengroup.org/onlinepubs/009695399/utilities/pax.html
> says: "If a particular implementation does not recognize the type, or
> the user does not have appropriate privilege to create that type, the
> file shall be extracted as if it were a regular file if the file type
> is defined to have a meaning for the size field that could cause data
> logical records to be written on the medium [...]."
Ahh, thanks for digging this up. I knew POSIX said something about
this somewhere when I responded (and that is why I said "even though
I wouldn't have minded if the original implementation were to apply
the same umask for these entries that look like "dummy files" to
them."), but I didn't have patience to read it through myself.
> It's surprising and sad to see *pax* implementations not supporting
> pax extended headers in 2014, though. It seems long file names
> etc. are not common enough. Or perhaps pax is simply not used that
> much.
I would say that if we really want strictness, the _right_ way
forward might be:
- Use the tar.paxumask patch from Linus to allow those who are aware
of and care about the older pax implementations (i.e. Brian) to
optionally tweak the umask applied to those extended header entries,
while keeping the traditional behaviour as the default;
- Warn that the default will change to use tar.paxumask that is the
same as tar.umask in some future version of Git;
- In some future version, flip the default.
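The three steps above can be sketched as a small Python model of how the mode of an extended header entry would be resolved. This is only an illustration of the proposal, not git's code: the function name and its parameters are hypothetical, and the base mode 0100666 for extended header entries is an assumption about the traditional behaviour. The default of 002 for tar.umask is the documented one.

```python
TAR_UMASK_DEFAULT = 0o002        # documented default of tar.umask
PAX_HEADER_BASE_MODE = 0o100666  # assumed traditional mode of pax header entries


def pax_header_mode(tar_umask=TAR_UMASK_DEFAULT, pax_umask=None,
                    default_follows_tar_umask=False):
    """Mode written for an extended header entry under the proposal.

    pax_umask stands for an explicitly configured tar.paxumask (hypothetical
    setting). When it is unset, the default is 000 today and would become
    tar.umask once the default is flipped.
    """
    if pax_umask is None:
        pax_umask = tar_umask if default_follows_tar_umask else 0o000
    return PAX_HEADER_BASE_MODE & ~pax_umask


print(oct(pax_header_mode()))                                # traditional default
print(oct(pax_header_mode(pax_umask=0o022)))                 # explicit opt-in
print(oct(pax_header_mode(default_follows_tar_umask=True)))  # after the flip
```

Under this model the archive bytes stay identical for everyone who sets nothing until the default is flipped, which is the property the signature upload workflow depends on.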
Given that it will be a race between us flipping the default and the
affected implementations of extraction tools going extinct, however,
I do not think such a transition would be of high priority.
Thread overview: 18+ messages
[not found] <20141020115943.GA27144@gmail.com>
2014-10-20 15:25 ` Sources for 3.18-rc1 not uploaded Linus Torvalds
2014-10-20 18:28 ` Junio C Hamano
2014-10-20 18:37 ` Konstantin Ryabitsev
2014-10-20 19:43 ` Junio C Hamano
2014-10-20 21:52 ` Greg KH
2014-10-20 22:28 ` brian m. carlson
2014-10-20 23:17 ` Linus Torvalds
2014-10-21 8:08 ` Michael J Gruber
2014-10-21 16:25 ` Linus Torvalds
2014-10-21 17:25 ` David Kastrup
2014-10-21 18:14 ` Junio C Hamano
2014-10-22 9:42 ` Michael J Gruber
2014-10-23 1:09 ` brian m. carlson
2014-10-26 18:59 ` René Scharfe
2014-10-26 21:15 ` brian m. carlson
2014-10-27 20:19 ` Junio C Hamano
2014-10-20 23:44 ` Konstantin Ryabitsev
2014-10-21 18:59 ` Junio C Hamano