From: Jonathan Nieder <jrnieder@gmail.com>
To: Ramkumar Ramachandra <artagnon@gmail.com>
Cc: Git List <git@vger.kernel.org>,
Peter Baumann <waste.manager@gmx.de>,
David Barr <david.barr@cordelta.com>,
Sverre Rabbelier <srabbelier@gmail.com>,
Erik Faye-Lund <kusmabite@gmail.com>,
Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH 2/5] fast-export: Introduce --inline-blobs
Date: Tue, 29 Mar 2011 15:44:12 -0500 [thread overview]
Message-ID: <20110329204412.GA13812@elie> (raw)
In-Reply-To: <1301422392-21177-3-git-send-email-artagnon@gmail.com>
Hi,
Ramkumar Ramachandra wrote:
> Introduce a new command-line option --inline-blobs that always inlines
> blobs instead of referring to them via marks or their original SHA-1
> hash.
Could you expand on this? What does it mean to inline a blob (is it the
same thing as the reference manual describes as using the inline data
format with the filemodify command)? Why do we want to always do
that? How would a person or script choose whether to use this option?
Are there any downsides?
I ask because I would be happy to use something like this. ;-) Thanks
for working on it.
> --- a/Documentation/git-fast-export.txt
> +++ b/Documentation/git-fast-export.txt
> @@ -90,6 +90,11 @@ marks the same across runs.
> resulting stream can only be used by a repository which
> already contains the necessary objects.
>
> +--inline-blobs::
> + Inline all blobs, instead of referring to them via marks or
> + their original SHA-1 hash. This is useful to parsers, as they
> + don't need to persist blobs.
This explanation leaves something out, I think. If it is useful to
parsers, that means it simplifies the syntax, so a natural conclusion
would be that parsers do not need to learn about the non-inline
syntax. But I think the last time[1] this came up, we decided that
one should not encourage that, because it moves away from a world in
which "git fast-export", "hg fast-export", and svn-fe use one standard
format and can be used interchangeably.
[1] http://thread.gmane.org/gmane.comp.version-control.git/165237/focus=165289
Perhaps it would be better to say something to the effect that "This
can decrease the memory footprint and complexity of the work some
fast-import backends have to do"? In other words, it's just an
optimization.
To that end, if the same blob keeps on coming up again and again, I'd
be tempted to allow making a mark for it in the future, even when
--inline-blobs is specified. In other words, I'd prefer (unless
real-world considerations prevent it) for --inline-blobs to be a hint
or a permission instead of something binding.
> --- a/builtin/fast-export.c
> +++ b/builtin/fast-export.c
[...]
> @@ -118,7 +119,7 @@ static void handle_object(const unsigned char *sha1)
> char *buf;
> struct object *object;
>
> - if (no_data)
> + if (no_data || inline_blobs)
> return;
Maybe inline_blobs should imply no_data internally?
>
> if (is_null_sha1(sha1))
> @@ -218,7 +219,23 @@ static void show_filemodify(struct diff_queue_struct *q,
> if (no_data || S_ISGITLINK(spec->mode))
> printf("M %06o %s %s\n", spec->mode,
> sha1_to_hex(spec->sha1), spec->path);
> - else {
> + else if (inline_blobs) {
If so, this could be something like
int inline_me = inline_blobs && !S_ISGITLINK(spec->mode);
...
if (no_data || S_ISGITLINK(spec->mode)) {
const char *dataref =
inline_me ? "inline" : sha1_to_hex(spec->sha1);
printf("M %06o %s %s\n", spec->mode, dataref, spec->path);
} else {
struct object *object = lookup_object(spec->sha1);
printf("M %06o :%d %s\n", spec->mode,
get_object_mark(object), spec->path);
}
if (inline_blob && export_data(spec->data, spec->size))
die_errno("Could not write blob '%s'",
sha1_to_hex(spec->sha1));
> --- a/contrib/svn-fe/.gitignore
> +++ b/contrib/svn-fe/.gitignore
[...]
> --- a/contrib/svn-fe/Makefile
> +++ b/contrib/svn-fe/Makefile
[...]
> --- /dev/null
> +++ b/contrib/svn-fe/svn-fi.c
[...]
> --- /dev/null
> +++ b/contrib/svn-fe/svn-fi.txt
Snuck in from another patch?
Hope that helps,
Jonathan
next prev parent reply other threads:[~2011-03-29 20:44 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-03-29 18:13 [PATCH 0/5] Towards a Git to SVN bridge Ramkumar Ramachandra
2011-03-29 18:13 ` [PATCH 1/5] date: Expose the time_to_tm function Ramkumar Ramachandra
2011-03-29 18:13 ` [PATCH 2/5] fast-export: Introduce --inline-blobs Ramkumar Ramachandra
2011-03-29 20:44 ` Jonathan Nieder [this message]
2011-03-29 18:13 ` [PATCH 3/5] strbuf: Introduce strbuf_fwrite corresponding to strbuf_fread Ramkumar Ramachandra
2011-03-31 2:15 ` Jonathan Nieder
2011-03-29 18:13 ` [PATCH 4/5] vcs-svn: Introduce svnload, a dumpfile producer Ramkumar Ramachandra
2011-03-29 18:13 ` [PATCH 5/5] t9012-svn-fi: Add tests for svn-fi Ramkumar Ramachandra
2011-03-31 23:23 ` Jonathan Nieder
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110329204412.GA13812@elie \
--to=jrnieder@gmail.com \
--cc=artagnon@gmail.com \
--cc=david.barr@cordelta.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=kusmabite@gmail.com \
--cc=srabbelier@gmail.com \
--cc=waste.manager@gmx.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).