git.vger.kernel.org archive mirror
From: "Theodore Ts'o" <tytso@mit.edu>
To: phillip.wood@dunelm.org.uk
Cc: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	git@vger.kernel.org, "Junio C Hamano" <gitster@pobox.com>,
	"Eli Schwartz" <eschwartz93@gmail.com>,
	"René Scharfe" <l.s.r@web.de>,
	"brian m . carlson" <sandals@crustytoothpaste.net>,
	"Konstantin Ryabitsev" <konstantin@linuxfoundation.org>,
	"Michal Suchánek" <msuchanek@suse.de>,
	"Raymond E . Pasco" <ray@ameretat.dev>,
	demerphq <demerphq@gmail.com>
Subject: Re: [PATCH 0/9] git archive: use gzip again by default, document output stabilty
Date: Fri, 3 Feb 2023 10:47:44 -0500
Message-ID: <Y90soPW6KRB7PQCY@mit.edu>
In-Reply-To: <771a98ca-9540-ad4e-dfba-9d304e1dff09@dunelm.org.uk>

On Thu, Feb 02, 2023 at 04:17:09PM +0000, Phillip Wood wrote:
> Playing devil's advocate for a moment as we're not going to promise that the
> compressed output of "git archive" will be stable in the future perhaps we
> should use this breakage as an opportunity to highlight that to users and to
> advertize the config setting that allows them to use gzip for compressing
> archives. Reverting the change gives the misleading impression that we're
> making a commitment to keeping the output stable. The focus of this thread
> seems to be the problems relating to github which they have already
> addressed.
> 
> I think there is general agreement that it is not practical to promise that
> the compressed output of "git archive" is stable so maybe it is better to
> make that clear now while users can work around it in the short term with a
> config setting rather than waiting until we're faced with some security or
> other issue that forces a change to the output which users cannot work
> around so easily.

I would be in favor of adding a config option that allows using the
internal gzip implementation, while leaving the default set to keep
things compatible.

The reason is that it should be easy for a forge provider such as
GitHub to break things, deliberately.  Sound insane?  Hear me out.

At $WORK, we have a highly reliable system, Paxos.  It is a highly
fault-tolerant system, so it rarely fails.  But "rarely fails" is not
the same as "never fails".  And hopefully, things should degrade
gracefully if there is a Paxos outage.  But as the Google SREs are
fond of saying, "Hope is not a strategy".

So periodically, the people who run the Paxos service will
deliberately force downtime for a short amount of time.  The fact that
they will do this is well advertised, and scheduled ahead of time ---
and teams responsible for user-facing services are supposed to make
sure that end-users don't notice when this happens.  Maybe they won't
be able to update configurations as easily while Paxos is down, but it
shouldn't cause a user-visible outage.

So what I would recommend to the GitHub product manager is that once
a quarter, on a well-advertised date, they flip the switch and
break the git archive checksums for, say, an hour.  Then next quarter,
they advertise that the switch will be thrown for 2 hours, doubling
each time, until it is ramped up to 16 hours.
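
A minimal sketch of the ramp-up described above, assuming the first
window is 1 hour and the cap is 16 hours as stated.  The function name
and its parameters are hypothetical, chosen only for illustration; this
is not an actual GitHub or git mechanism.

```python
def brownout_schedule(start_hours=1, max_hours=16):
    """Return the brownout length for each quarter, doubling until max_hours.

    Hypothetical helper: it only works out the arithmetic of the proposal.
    """
    durations = []
    hours = start_hours
    while hours <= max_hours:
        durations.append(hours)
        hours *= 2  # each quarter's window is twice the previous one
    return durations


if __name__ == "__main__":
    for quarter, hours in enumerate(brownout_schedule(), start=1):
        print(f"Quarter {quarter}: archive checksums broken for {hours} hour(s)")
```

With these defaults the windows are 1, 2, 4, 8 and 16 hours, so the
full ramp takes five quarters.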

This will provide the necessary nudge so that all of these badly
designed systems that depend on archives downloaded from arbitrary git
hubs being stable will rethink their position, while minimizing the
end-user customer impact.  Otherwise, I predict that Bazel, homebrew,
etc. will continue to rely on this ill-considered assumption, and at
some point in the future, when we *do* have a much better reason to
want to make a change to the tar or compression algorithm, all of
these end users will once again scream bloody murder.

Of course, it is going to be up to each forge provider to decide
whether they want to do this.  But we can make it easy for them, and
I'd argue it is in our interest to do so.  Otherwise we'll get
constrained in the future by the fear of massive user blowback, no
matter what we say in our documentation regarding "no promises --- and
next time, we really mean it!"

	      	       	       	    	  - Ted
