From: "Theodore Ts'o" <tytso@mit.edu>
To: demerphq <demerphq@gmail.com>
Cc: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
"Michal Suchánek" <msuchanek@suse.de>,
"brian m. carlson" <sandals@crustytoothpaste.net>,
"Konstantin Ryabitsev" <konstantin@linuxfoundation.org>,
"Eli Schwartz" <eschwartz93@gmail.com>,
"Git List" <git@vger.kernel.org>
Subject: Re: Stability of git-archive, breaking (?) the Github universe, and a possible solution
Date: Wed, 1 Feb 2023 13:56:27 -0500
Message-ID: <Y9q129WbseimgeBS@mit.edu>
In-Reply-To: <CANgJU+VNY-VziRijSwyb1WF9s31hKroK+2VJ0qEGiYweiA59Ug@mail.gmail.com>
If the goal is stable tar.gz files, Debian has a very nice solution
called pristine-tar[1]. This allows you to store a tar.gz image in a
very efficient way, by leveraging the objects in the git repository.
[1] https://manpages.debian.org/unstable/pristine-tar/pristine-tar.1.en.html
The data is stored on the pristine-tar branch, and is quite efficient:
% git show --stat pristine-tar
commit 56dded989c9e0c852b8af9ae72ffe94270bfd34a (origin/pristine-tar, github/pristine-tar, pristine-tar)
Author: Theodore Ts'o <tytso@mit.edu>
Date:   Thu Dec 30 01:06:13 2021 -0500

    pristine-tar data for e2fsprogs_1.46.5.orig.tar.gz

 e2fsprogs_1.46.5.orig.tar.gz.asc   |  11 +++++++++++
 e2fsprogs_1.46.5.orig.tar.gz.delta | Bin 0 -> 59034 bytes
 e2fsprogs_1.46.5.orig.tar.gz.id    |   1 +
 3 files changed, 12 insertions(+)
And this allows me to reproduce the original tar.gz file, which is
about 9 megabytes, along with its GPG signature file. The *.id file
contains the git commit from which the tar file was generated, and
this is what allows the *.delta file to be as small as it is.
% pristine-tar checkout e2fsprogs_1.46.5.orig.tar.gz -s e2fsprogs_1.46.5.orig.tar.gz.asc
pristine-tar: successfully generated e2fsprogs_1.46.5.orig.tar.gz
pristine-tar: successfully generated e2fsprogs_1.46.5.orig.tar.gz.asc
% ls -sh e2fsprogs_1.46.5.orig.tar.gz*
9.1M e2fsprogs_1.46.5.orig.tar.gz 4.0K e2fsprogs_1.46.5.orig.tar.gz.asc
% gpg e2fsprogs_1.46.5.orig.tar.gz.asc
gpg: WARNING: no command supplied. Trying to guess what you mean ...
gpg: assuming signed data in 'e2fsprogs_1.46.5.orig.tar.gz'
gpg: Signature made Thu 30 Dec 2021 01:02:52 AM EST
gpg: using RSA key 2B69B954DBFE0879288137C9F2F95956950D81A3
gpg: Good signature from "Theodore Ts'o <tytso@mit.edu>" [ultimate]
gpg: aka "Theodore Ts'o <tytso@debian.org>" [ultimate]
gpg: aka "Theodore Ts'o <tytso@google.com>" [ultimate]
Primary key fingerprint: 3AB0 57B7 E78D 945C 8C55 91FB D36F 769B C118 04F0
Subkey fingerprint: 2B69 B954 DBFE 0879 2881 37C9 F2F9 5956 950D 81A3
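The recording side is not shown above. As a rough sketch (not taken from
the actual e2fsprogs release scripts), the two helpers below wrap
"pristine-tar commit" and "pristine-tar checkout". The function names,
the idea of comparing against a saved SHA-256, and whatever upstream ref
gets passed in are my own assumptions for illustration; only the
pristine-tar subcommands and the -s option come from the tool itself.

```shell
#!/bin/sh
# Sketch only: thin wrappers around pristine-tar's "commit" and "checkout".
# Function names and the checksum comparison are illustrative assumptions.

# record_tarball TARBALL UPSTREAM_REF [SIGNATURE]
# Stores the data needed to regenerate TARBALL on the pristine-tar branch.
# UPSTREAM_REF is the tag/branch/commit the tarball was built from.
record_tarball() {
    if [ "$#" -lt 2 ]; then
        echo "usage: record_tarball TARBALL UPSTREAM_REF [SIGNATURE]" >&2
        return 2
    fi
    if [ -n "${3:-}" ]; then
        # Also record the detached signature alongside the delta.
        pristine-tar commit "$1" "$2" -s "$3"
    else
        pristine-tar commit "$1" "$2"
    fi
}

# restore_and_compare TARBALL EXPECTED_SHA256
# Regenerates TARBALL from the pristine-tar branch and succeeds only if
# its SHA-256 matches the checksum saved at release time.
restore_and_compare() {
    if [ "$#" -ne 2 ]; then
        echo "usage: restore_and_compare TARBALL EXPECTED_SHA256" >&2
        return 2
    fi
    pristine-tar checkout "$1" || return 1
    actual=$(sha256sum "$1" | cut -d' ' -f1)
    [ "$actual" = "$2" ]
}
```

Something like "record_tarball foo_1.0.orig.tar.gz v1.0 foo_1.0.orig.tar.gz.asc"
would be run once at release time (the file and tag names here are
placeholders), and restore_and_compare later on any clone that has the
pristine-tar branch.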
This is currently a Debian special. While its functionality was
designed to work well with Debian packaging workflows, it's a
general tool that could be used in multiple contexts, not just for
Debian packaging.
If I recall correctly, pristine-tar is currently in maintenance mode,
and I suspect if someone was interested in investing time into making
pristine-tar more portable to other OSes, including macOS and Windows,
and perhaps even integrating it into git directly, the current
maintainer of pristine-tar might be quite happy to let other people
give the code more TLC.
- Ted