From: Jeff King <peff@peff.net>
To: Elijah Newren <newren@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>,
Lars Schneider <larsxschneider@gmail.com>,
"brian m. carlson" <sandals@crustytoothpaste.net>,
Taylor Blau <me@ttaylorr.com>,
Jonathan Nieder <jrnieder@gmail.com>
Subject: Re: [PATCH 09/10] fast-export: add a --show-original-ids option to show original names
Date: Mon, 12 Nov 2018 07:53:41 -0500
Message-ID: <20181112125341.GH3956@sigill.intra.peff.net>
In-Reply-To: <CABPp-BGNt0FcqiT=OqctjOEvY9ewNUJZ-Rs_aVEihjbQt3K8tQ@mail.gmail.com>

On Sun, Nov 11, 2018 at 12:32:22AM -0800, Elijah Newren wrote:

> > > Documentation/git-fast-export.txt | 7 +++++++
> > > builtin/fast-export.c | 20 +++++++++++++++-----
> > > fast-import.c | 17 +++++++++++++++++
> > > t/t9350-fast-export.sh | 17 +++++++++++++++++
> > > 4 files changed, 56 insertions(+), 5 deletions(-)
> >
> > The fast-import format is documented in Documentation/git-fast-import.txt.
> > It might need an update to cover the new format.
>
> We document the format in both fast-import.c and
> Documentation/git-fast-import.txt? Maybe we should delete the long
> comments in fast-import.c so this isn't duplicated?

Yes, that is probably worth doing (see the comment at the top of
fast-import.c). Some information might need to be migrated.

If we're going to have just one spot, I think it needs to be the
user-facing documentation. This is a public interface that other people
are building compatible implementations for (including your new tool).

> > > +--show-original-ids::
> > > + Add an extra directive to the output for commits and blobs,
> > > + `originally <SHA1SUM>`. While such directives will likely be
> > > + ignored by importers such as git-fast-import, it may be useful
> > > + for intermediary filters (e.g. for rewriting commit messages
> > > + which refer to older commits, or for stripping blobs by id).
> >
> > I'm not quite sure how a blob ends up being rewritten by fast-export (I
> > get that commits may change due to dropping parents).
>
> It doesn't get rewritten by fast-export; it gets rewritten by other
> intermediary filters, e.g. in something like this:
>
> git fast-export --show-original-ids --all | intermediary_filter |
> git fast-import
>
> The intermediary_filter program may want to strip out blobs by id, or
> remove filemodify and filedelete directives unless they touch certain
> paths, etc.

OK, that matches my understanding. So why does fast-export need to print
the blob ids? If the intermediary is rewriting blobs, it can then
produce the "originally" line itself, can't it?

The more interesting case I guess is your "strip out blobs by id"
example. There the intermediary _could_ do so itself, but it would
require recomputing the object id of each blob.
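
For a sense of what that recomputation involves, here is a minimal Python
sketch. It assumes a SHA-1 repository; the function name `git_blob_id` is
made up for illustration. A blob's id is the SHA-1 of a small header
("blob", the payload length in bytes, a NUL) followed by the payload, so
the filter would have to hash every byte of every blob it sees:

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object id git assigns to a blob with this content."""
    # The object header is "blob <size>" followed by a NUL byte.
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()
```

This should agree with `git hash-object` run on a file holding the same
bytes, but it costs a full hash per blob, which is the work that having
fast-export print the id would save.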

If you use "--no-data", then this just works (we specify tree entries by
object id, rather than by mark). But I can see how it would be useful to
have the information even without "--no-data" (i.e., if you are doing
multiple kinds of rewrites on a single stream).
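
The "--no-data" case can be sketched as a plain line filter, since each
filemodify line then has the form `M <mode> <object id> <path>`. The
Python below is only an illustration: `strip_blobs_by_id` and
`unwanted_ids` are hypothetical names, and it assumes the stream uses the
`data <count>` form (which is what fast-export writes) so that commit and
tag messages can be copied through without being inspected.

```python
def strip_blobs_by_id(src, dst, unwanted_ids):
    """Copy a `fast-export --no-data` stream, dropping unwanted blobs.

    src and dst are binary streams; unwanted_ids is a set of hex ids (str).
    """
    while True:
        line = src.readline()
        if not line:
            break
        if line.startswith(b"data "):
            # Message payload: copy it verbatim so that text which merely
            # looks like a directive is never filtered.
            dst.write(line)
            dst.write(src.read(int(line[5:].strip())))
            continue
        if line.startswith(b"M "):
            # With --no-data the third field is an object id, not a mark.
            fields = line.split(b" ", 3)
            if len(fields) == 4 and fields[2].decode("ascii", "replace") in unwanted_ids:
                continue
        dst.write(line)
```

It would sit in the pipeline quoted above in place of
intermediary_filter, reading sys.stdin.buffer and writing
sys.stdout.buffer.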

I think the thing that confused me is that this "originally" is doing
two things:

- mentioning blob ids as an optimization / convenience for the reader
- mentioning commit (and presumably tag?) ids that were
rewritten as part of a partial history export. I suppose even trees
could be rewritten that way, too, but fast-import doesn't generally
consider trees to be a first-class item.
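
The first use can be illustrated with a rough Python sketch of an
intermediary that strips blobs by id from a full stream (no "--no-data").
Everything here is an assumption for illustration: the function name
`drop_blobs_by_original_id` is invented, the directive spelling
`originally <id>` is taken from the documentation hunk quoted above, and
the directive is assumed to appear in the blob header before the `data`
line. Filemodify lines refer to blobs by mark in this mode, so the filter
remembers which marks it dropped.

```python
def drop_blobs_by_original_id(src, dst, unwanted_ids):
    """Copy a fast-export stream from src to dst, omitting unwanted blobs.

    src and dst are binary streams; unwanted_ids is a set of hex ids (str).
    """
    dropped_marks = set()
    pending = None  # one line of lookahead, used after blob payloads

    def next_line():
        nonlocal pending
        if pending is not None:
            line, pending = pending, None
            return line
        return src.readline()

    while True:
        line = next_line()
        if not line:
            break
        if line == b"blob\n":
            record = [line]
            mark = original = None
            # Collect the blob header up to and including its "data" line.
            while not record[-1].startswith(b"data "):
                header = next_line()
                if not header:
                    raise ValueError("truncated blob record")
                if header.startswith(b"mark "):
                    mark = header[5:].strip()
                elif header.startswith(b"originally "):
                    original = header[11:].strip().decode("ascii")
                record.append(header)
            record.append(src.read(int(record[-1][5:].strip())))
            # The payload may be followed by one optional newline.
            trailer = src.readline()
            if trailer == b"\n":
                record.append(trailer)
            else:
                pending = trailer
            if original in unwanted_ids:
                dropped_marks.add(mark)
            else:
                dst.writelines(record)
            continue
        if line.startswith(b"data "):
            # Commit or tag message: copy the payload without inspecting it.
            dst.write(line)
            dst.write(src.read(int(line[5:].strip())))
            continue
        if line.startswith(b"M "):
            # Drop filemodify lines that point at a blob removed earlier.
            fields = line.split(b" ", 3)
            if len(fields) == 4 and fields[2] in dropped_marks:
                continue
        dst.write(line)
```

Without the directive, the same filter would need to hash each payload
itself to decide whether to drop it.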

So I'm OK with it, but I wonder if there is an easier way to explain it.

-Peff