From: Jonathan Nieder <jrnieder@gmail.com>
To: vcs-fast-import-devs@lists.launchpad.net
Cc: David Barr <david.barr@cordelta.com>,
Git Mailing List <git@vger.kernel.org>,
Ramkumar Ramachandra <artagnon@gmail.com>,
Sverre Rabbelier <srabbelier@gmail.com>,
"Shawn O. Pearce" <spearce@spearce.org>,
Tomas Carnecky <tom@dbservice.com>, Sam Vilain <sam@vilain.net>
Subject: [RFC] fast-import: 'cat-blob' and 'ls' commands
Date: Wed, 26 Jan 2011 15:39:22 -0600
Message-ID: <20110126213922.GA19727@burratino>
In-Reply-To: <20110103080130.GA8842@burratino>
Hi fast importers,
I would like your thoughts on a few developments in fast-import
protocol (thanks to David, Ram, Sverre, Tomas, and Sam for work so
far). If they seem good, I'd be happy to help make patches to other
backends so these can be implemented widely.
Contents: cat-blob command, filemodify (M) with trees, ls command.
cat-blob command
----------------
fast-import 1.7.4-rc0 added a new "cat-blob" feature. It is meant to
allow exporters that receive changes in delta form to avoid having
to remember the full text of blobs already exported or re-retrieve
them from the source repository.
It works like this:
1. Out of band, the fast-import frontend and backend negotiate a
   channel for the backend to send replies to the frontend. In
   git fast-import, this is a file descriptor, defaulting to
   stdout. So you can do:

        mkfifo replies &&
        $frontend <replies |
        git fast-import --cat-blob-fd=3 3>replies

   The intent is that stdin would typically be a socket and this file
   descriptor would point to that.
2. The frontend (optionally) declares use of this feature by putting

        feature cat-blob

   at the beginning of the stream.

3. When the frontend needs a previously exported blob to use as delta
   preimage, it uses the cat-blob command.

        cat-blob :3

   The backend replies with something like

        7c8987a987ca98c blob 6
        hello

   More precisely, the output format is

        <dataref> SP 'blob' SP <length> LF
        <full text of blob> LF

   The <dataref> can be any text not including whitespace.
The frontend can rely on a little buffering if it wants to print a
command after the "cat-blob", but it must read the reply in its
entirety if it expects the backend to act on later commands. In
other words, the cat-blob command is not guaranteed to be
asynchronous.
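
To make step 3 concrete, here is a minimal Python sketch of how a
frontend might consume one reply in its entirety. The function name
read_cat_blob_reply is made up for illustration, and the sketch
assumes a buffered binary stream (for example the read end of the
"replies" fifo) whose read(n) blocks until n bytes have arrived.

```python
import io


def read_cat_blob_reply(stream):
    """Read one cat-blob reply and return (dataref, blob_bytes).

    Hypothetical helper: consumes the header line, exactly <length>
    bytes of blob text, and the trailing LF, so the stream is left
    positioned at the start of the next reply.
    """
    header = stream.readline()
    if not header.endswith(b"\n"):
        raise ValueError("truncated header: %r" % header)
    fields = header[:-1].split(b" ")
    if len(fields) != 3 or fields[1] != b"blob":
        raise ValueError("unexpected header: %r" % header)
    dataref = fields[0].decode("ascii")
    length = int(fields[2])
    data = stream.read(length)
    if len(data) != length:
        raise ValueError("truncated blob text")
    if stream.read(1) != b"\n":
        raise ValueError("missing LF after blob text")
    return dataref, data


# The example reply from above: 6 bytes of blob text ("hello" plus LF),
# followed by the terminating LF.
reply = io.BytesIO(b"7c8987a987ca98c blob 6\nhello\n\n")
dataref, data = read_cat_blob_reply(reply)
```

Reading by the declared length rather than scanning for a newline
matters because the blob text itself may contain LF bytes.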
This protocol is used by the svn-fe[1] tool to handle Subversion dump
files in version 3 (--deltas) format and seems to work ok.
Does this look sane or does it need tweaking or more detailed
specification to be widely useful? Even once git 1.7.4 is out, it
should be possible to make improvements using a new "feature" name.
filemodify (M) with trees
-------------------------
fast-import 1.7.3-rc0 introduced the ability for a filemodify (M)
command to place a tree named by mark or other <dataref> at a given
path, replacing whatever was there before. The implementation had
some kinks, which fast-import 1.7.4-rc0 ironed out.
Without some way to specify marks or learn tree names out of band, it
is not very useful. With some way to learn tree names, it can be
used, for example, to rewrite revision metadata while reusing the old
tree data:

        commit refs/heads/master
        mark :11
        committer A U Thor <author@example.com> Wed, 26 Jan 2011 15:14:11 -0600
        data <<EOF
        New change description
        EOF
        M 040000 4b825dc642cb6eb9a060e54bf8d69288fbee4904 ""

There is no "feature" name for this. Corner case: a command to
replace a path with the empty tree is interpreted[2] as meaning to remove
that file or subtree, because git does not track empty directories.
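
As a rough illustration of the metadata-rewriting use case, a
frontend could generate such a commit like this. This is only a
sketch: commit_reusing_tree is a hypothetical helper, and it writes
the message with the exact-byte-count form of "data" instead of the
"<<EOF" form used in the example. The tree name is assumed to have
been learned by some other means.

```python
def commit_reusing_tree(ref, mark, committer, message, tree):
    """Return stream bytes for a commit whose root is an existing tree.

    Hypothetical helper. `committer` is passed through unchanged, so it
    must already be in whatever date format the backend expects.
    """
    body = message.encode("utf-8")
    header = "commit {}\nmark :{}\ncommitter {}\ndata {}\n".format(
        ref, mark, committer, len(body))
    # Place the named tree at the root path "", replacing what was there.
    footer = '\nM 040000 {} ""\n'.format(tree)
    return header.encode("utf-8") + body + footer.encode("utf-8")


# Values copied from the example above.
stream = commit_reusing_tree(
    "refs/heads/master",
    11,
    "A U Thor <author@example.com> Wed, 26 Jan 2011 15:14:11 -0600",
    "New change description\n",
    "4b825dc642cb6eb9a060e54bf8d69288fbee4904",
)
```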
Do the semantics seem reasonable? Should this get a corresponding
"feature"?
ls command
----------
A patch in flight[3] introduces an "ls" command to read directory
entries from the active commit or a named commit. This allows
printing a blob from the active commit or copying a blob or tree from
a previous commit for use in the current one.
It works like so:
1. Frontend writes

        'ls' SP <path> LF

   or

        'ls' SP <dataref> SP <path> LF

   In the first form, the <path> _must_ be surrounded in quotes
   and quoted C-style. In the second form, the <dataref> can refer
   to a tag, commit, or tree.

2. Backend replies through the cat-blob channel:

        <mode> SP <type> SP <dataref> HT <path> LF

   <mode> is a 6-digit octal mode: 040000, 100644, 100755,
   120000, or 160000 for a directory, regular file, executable file,
   symlink, or submodule, respectively.

   <type> is 'blob', 'tree', or 'commit'.

   <dataref> represents the corresponding blob, tree, or commit
   object.

   <path> is the path in question. It can be quoted C-style and
   must be if the path starts with '"' or contains a newline.

3. Frontend reads the reply. The frontend might use that <dataref> in
   a later filemodify (M) or cat-blob command.
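
For frontend authors, the exchange above might look roughly like this
in Python. All names here are hypothetical. The quoting helpers only
handle backslash, double quote, newline, and tab, not the octal
escapes a full C-style quoter would need, and the sketch assumes the
backend also accepts a quoted <path> in the second form of the
command.

```python
import re

# Minimal C-style escapes; octal escapes are deliberately left out.
_ESCAPES = {"\\": "\\\\", '"': '\\"', "\n": "\\n", "\t": "\\t"}
_UNESCAPES = {escaped[1]: raw for raw, escaped in _ESCAPES.items()}


def quote_path(path):
    """Surround path in double quotes, escaping special characters."""
    return '"' + "".join(_ESCAPES.get(c, c) for c in path) + '"'


def unquote_path(text):
    """Undo quote_path; unquoted paths are returned unchanged."""
    if not text.startswith('"'):
        return text
    if len(text) < 2 or not text.endswith('"'):
        raise ValueError("unterminated quoted path: %r" % text)
    return re.sub(r"\\(.)", lambda m: _UNESCAPES[m.group(1)], text[1:-1])


def ls_command(path, dataref=None):
    """Format an ls command for the active commit or a named <dataref>."""
    if dataref is None:
        return "ls {}\n".format(quote_path(path))
    return "ls {} {}\n".format(dataref, quote_path(path))


def parse_ls_reply(line):
    """Split '<mode> SP <type> SP <dataref> HT <path> LF' into a tuple."""
    meta, tab, path = line.rstrip("\n").partition("\t")
    fields = meta.split(" ")
    if not tab or len(fields) != 3:
        raise ValueError("unexpected ls reply: %r" % line)
    mode, kind, dataref = fields
    return mode, kind, dataref, unquote_path(path)
```

The <dataref> returned by parse_ls_reply is what a frontend would
feed into a later M or cat-blob command.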
Proposed updates to svn-fe[1] use this heavily and work well.
One ugly corner case: although it is intended to allow "missing
<path>" as a reply when the path is missing, the proposed patch
makes git fast-import use an empty tree to signal that case,
to ensure that, for example,

        ls ""
        M <mode> <dataref> ""

is always a non-operation.
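
Until that corner case is settled, a frontend talking to git
fast-import would probably want to treat both kinds of reply as
"not there". A small sketch, with the hypothetical name
path_is_missing; the constant is git's name for the empty tree, so
other backends would need their own value:

```python
# git's SHA-1 name for the empty tree (the same one used in the
# filemodify example earlier). Backend-specific.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"


def path_is_missing(reply_line):
    """Return True if an ls reply indicates the path does not exist.

    Accepts either the intended 'missing <path>' reply or the
    empty-tree reply that the proposed patch produces.
    """
    line = reply_line.rstrip("\n")
    if line.startswith("missing "):
        return True
    fields = line.partition("\t")[0].split(" ")
    return len(fields) == 3 and fields[1] == "tree" and fields[2] == EMPTY_TREE
```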
No "feature" name yet. Even better, it's not part of git yet so
I invite you to nitpick to your heart's content. Maybe you'd rather
the command be called "ls-tree" instead of "ls"? Ask away. :)
Thoughts welcome, as always.
Jonathan
[1] http://repo.or.cz/w/git/jrn.git/blob/refs/heads/vcs-svn-pu:/vcs-svn/svndump.c
[2] Or rather, is not interpreted but ought to be, or else
fast-import will make it too easy to produce invalid commits. One of
the patches in series [3] fixes it.
[3] http://thread.gmane.org/gmane.comp.version-control.git/162698/focus=164448