From: Felipe Contreras <felipe.contreras@gmail.com>
To: esr@thyrsus.com
Cc: git@vger.kernel.org
Subject: Re: gitpacker progress report and a question
Date: Tue, 27 Nov 2012 02:29:22 +0100
Message-ID: <CAMP44s3HAzSPsrGwcpQpx_3n2aHK5wm++_7_Cbk3qRWMkxDh6g@mail.gmail.com>
In-Reply-To: <20121126234359.GA8042@thyrsus.com>
On Tue, Nov 27, 2012 at 12:43 AM, Eric S. Raymond <esr@thyrsus.com> wrote:
> Felipe Contreras <felipe.contreras@gmail.com>:
>> Might be easier to just call 'git ls-files --with-tree foo', but I
>> don't see the point of those calls:
>
> Ah, much is now explained. You were looking at an old version. I had
> in fact already fixed the subdirectories bug (I've updated my
> regression test to check) and have full support for branchy repos,
> preserving tags and branch heads.
So you are criticizing my code saying "it would then be almost
completely useless...", when this is in fact what you sent to the
list.
For the record, here is the output of a test with your script vs.
mine: the output is *exactly the same*:
---
== log ==
* afcbedc (tag: v0.2, master) bump
| * cbd2dce (devel) dev
|/
* 46f1813 (HEAD, test) remove
* df95e41 dot .
* ede0876 with
* d6f10fc extra
* e6362b1 (tag: v0.1) one
== files ==
file
== spaces ==
with
spaces
== dot ==
dot
.
== orig ref ==
refs/heads/test
== script ==
bc9a7d99132f97adeb5d2ca266bd3d8bc64ccb21 /home/felipec/Downloads/gitpacker.txt
Unpacking......(0.13 sec) done.
Packing......(0.28 sec) done.
== log ==
* 5d0b634 (HEAD, master) bump
* 2fe4a6d remove
* 0c27d3b dot .
* 5e36d3f with spaces
* d6f10fc extra
* e6362b1 one
== files ==
file
== spaces ==
with
spaces
== dot ==
dot
.
== orig ref ==
refs/heads/master
== script ==
33edcb28667b683fbb5f8782383f782f73c5e9e1 /home/felipec/bin/git-weave
== log ==
* afcbedc (HEAD, master) bump
* 46f1813 remove
* df95e41 dot .
* ede0876 with
* d6f10fc extra
* e6362b1 one
== files ==
file
== spaces ==
with
spaces
== dot ==
dot
.
== orig ref ==
refs/heads/test
---
Unfortunately, when I enable some testing stuff, this is what your
script throws:
---
== script ==
bc9a7d99132f97adeb5d2ca266bd3d8bc64ccb21 /home/felipec/Downloads/gitpacker.txt
Unpacking......(0.17 sec) done.
Packing......(0.02 sec) done.
Traceback (most recent call last):
File "/home/felipec/Downloads/gitpacker.txt", line 308, in <module>
git_pack(indir, outdir, quiet=quiet)
File "/home/felipec/Downloads/gitpacker.txt", line 171, in git_pack
command += " ".join(map(lambda p: "-p " + commit_id[int(p)],parents))
File "/home/felipec/Downloads/gitpacker.txt", line 171, in <lambda>
command += " ".join(map(lambda p: "-p " + commit_id[int(p)],parents))
IndexError: list index out of range
== log ==
fatal: bad default revision 'HEAD'
== files ==
fatal: tree-ish master not found.
== spaces ==
fatal: ambiguous argument ':/with': unknown revision or path not in
the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
== dot ==
fatal: ambiguous argument ':/dot': unknown revision or path not in the
working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
== orig ref ==
refs/heads/master
---
I'm attaching it in case you are interested.
Anyway, I can add support for branches and tags in no time, but I
wonder what's the point. Who will take so much time and effort to
generate all the branches and tags, and the log file?
If the goal is as you say "importing older projects that are available
only as sequences of release tarballs", then that code is overkill,
and it's not even making it easier to import the tarballs.
For that case, my proposed format:
tag v0.1 gst-av-0.1.tar "Release 0.1"
tag v0.2 gst-av-0.2.tar "Release 0.2"
tag v0.3 gst-av-0.3.tar "Release 0.3"
would be much more suitable.
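
To illustrate how little parsing such a format needs, here is a minimal Python sketch. It is not taken from either script: the names `Release` and `parse_manifest` are hypothetical, and it assumes each non-blank line has exactly the four shell-style fields shown above.

```python
import shlex
from typing import List, NamedTuple


class Release(NamedTuple):
    """One line of the proposed format (hypothetical type name)."""
    tag: str
    tarball: str
    message: str


def parse_manifest(text: str) -> List[Release]:
    """Parse lines of the form: tag NAME TARBALL "MESSAGE"."""
    releases = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        # shlex.split handles the double-quoted message field.
        fields = shlex.split(line)
        if not fields:
            continue  # skip blank lines
        if len(fields) != 4 or fields[0] != "tag":
            raise ValueError(f"line {lineno}: expected 'tag NAME TARBALL \"MESSAGE\"'")
        releases.append(Release(*fields[1:]))
    return releases


if __name__ == "__main__":
    sample = 'tag v0.1 gst-av-0.1.tar "Release 0.1"\n'
    for release in parse_manifest(sample):
        print(release.tag, release.tarball, release.message)
```

Each parsed entry names the tarball to extract, the tag to create, and the commit message, which is all the information the tarball-import case needs.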
>> > It doesn't issue delete ops.
>>
>> What do you mean?
>>
>> out.puts 'deleteall' <- All current files are removed
>
> Yours emits no D ops for files removed after a particular snapshot.
man git fast-import
---
This command is extremely useful if the frontend does not know (or
does not care to know) what files are currently on the branch, and
therefore cannot generate the proper filedelete commands to update the
content.
---
Why would I want to emit D operations? Again, deleteall takes care of that.
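
A minimal Python sketch of the idea, not the code of either script: every commit starts with `deleteall` and then re-adds the complete snapshot, so files missing from a later snapshot disappear without any D lines. The function names, the committer identity, and the timestamps are placeholders I made up, and paths are assumed to be plain names that need no quoting.

```python
from typing import Dict, List, Tuple

# (commit message, {path: file content}); hypothetical input shape
Snapshot = Tuple[str, Dict[str, bytes]]


def data_block(payload: bytes) -> bytes:
    """Format a fast-import 'data' command with an exact byte count."""
    return b"data %d\n%s\n" % (len(payload), payload)


def snapshots_to_stream(
    snapshots: List[Snapshot],
    ref: str = "refs/heads/master",
    committer: str = "Example <example@example.com>",  # placeholder identity
    start_time: int = 1112911993,                       # placeholder timestamp
) -> bytes:
    """Emit one commit per snapshot, each replacing the whole tree."""
    out = []
    for mark, (message, files) in enumerate(snapshots, start=1):
        when = start_time + 60 * (mark - 1)
        out.append(f"commit {ref}\n".encode())
        out.append(f"mark :{mark}\n".encode())
        out.append(f"committer {committer} {when} +0000\n".encode())
        out.append(data_block(message.encode()))
        if mark > 1:
            out.append(f"from :{mark - 1}\n".encode())
        # Drop everything from the parent tree; no per-file D ops needed.
        out.append(b"deleteall\n")
        for path in sorted(files):
            out.append(f"M 100644 inline {path}\n".encode())
            out.append(data_block(files[path]))
        out.append(b"\n")
    return b"".join(out)


if __name__ == "__main__":
    import sys
    history = [
        ("one", {"file": b"one\n", "extra": b"extra\n"}),
        ("remove", {"file": b"one\n"}),  # 'extra' is gone, no D line emitted
    ]
    sys.stdout.buffer.write(snapshots_to_stream(history))
```

Piping the output into `git fast-import` in an empty repository should yield two commits, the second without `extra`.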
>> > Be aware, however, that I consider easy editability by human beings
>> > much more important than squeezing the last microsecond out of the
>> > processing time. So, for example, I won't use data byte counts rather
>> > than end delimiters, the way import streams do.
>>
>> Well, if there's a line with a single dot in the commit message ('.'),
>> things would go very bad.
>
> Apparently you missed the part where I byte-stuffed the message content.
> It's a technique used in a lot of old-school Internet protocols, notably
> in SMTP.
You might have done that, but the user that generated the log file
might not have.
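
To make the concern concrete, here is a minimal Python sketch of SMTP-style dot-stuffing. It is an assumption about how such an encoding usually works, not a copy of what gitpacker does, and the names `stuff` and `unstuff` are hypothetical.

```python
from typing import List, Tuple


def stuff(message: str) -> str:
    """Encode a message: prefix '.' to lines starting with '.', end with a lone '.'."""
    lines = message.split("\n")
    stuffed = ["." + line if line.startswith(".") else line for line in lines]
    return "\n".join(stuffed) + "\n.\n"


def unstuff(lines: List[str]) -> Tuple[str, List[str]]:
    """Decode up to the first lone '.'; return (message, remaining lines)."""
    body = []
    for i, line in enumerate(lines):
        if line == ".":
            return "\n".join(body), lines[i + 1:]
        # Undo the stuffing: drop exactly one leading dot.
        body.append(line[1:] if line.startswith(".") else line)
    raise ValueError("missing '.' terminator")


if __name__ == "__main__":
    original = "dot\n.\n"
    encoded = stuff(original)
    decoded, _ = unstuff(encoded.split("\n"))
    print(decoded == original)

    # A hand-written log where nobody stuffed the '.' line:
    truncated, rest = unstuff(["dot", ".", "more text", "."])
    print(repr(truncated), rest)
```

The round trip is lossless only when the writer stuffed the message. In the hand-written case the decoder stops at the first lone dot, and the rest of the message is left over as if it were the next record.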
>> Personally I would prefer something like this:
>
> There's a certain elegance to that, but it would be hard to generate by hand.
You think this is harder to generate by hand:
---
tag v0.1 gst-av-0.1.tar "Release 0.1"
tag v0.2 gst-av-0.2.tar "Release 0.2"
tag v0.3 gst-av-0.3.tar "Release 0.3"
---
than this?
---
commit 1
directory gst-av-0.1
Release 0.1
.
commit 2
directory gst-av-0.2
Release 0.2
.
commit 3
directory gst-av-0.3
Release 0.3
.
---
And that is only after extracting the tarballs, which my script already
does automatically.
> Remember that a major use case for this tool is making repositories
> from projects whose back history exists only as tarballs.
Which is exactly what my script does, except even easier, because it
extracts the tarballs automatically.
> The main objective of the logfile design is to make hand-crafting
> these easy.
What does the above log file achieve that my log file doesn't?
--
Felipe Contreras
[-- Attachment #2: test-gitpacker --]
[-- Type: application/octet-stream, Size: 2534 bytes --]
#!/bin/sh
rm -rf test test-unpacked* test-new*
test_date=1
test_subdir=1
test_tick () {
	if test -z "${test_tick+set}"
	then
		test_tick=1112911993
	else
		test "$test_date" -eq 1 || \
			test_tick=$(($test_tick + 60))
	fi
	GIT_COMMITTER_DATE="$test_tick -0700"
	GIT_AUTHOR_DATE="$test_tick -0700"
	export GIT_COMMITTER_DATE GIT_AUTHOR_DATE
}
(
git init -q test
cd test
echo one > file
git add file
test_tick
git commit -q -m one
git tag v0.1
echo extra > extra
git add extra
test_tick
git commit -q -m extra
echo spaces >> file
test_tick
git commit -q -a -m "$(echo -e "with\n\nspaces")"
echo dot >> file
test_tick
git commit -q -a -m "$(echo -e "dot\n.\n")"
if test "$test_subdir" -eq 1
then
	mkdir subdir
	echo subdir > subdir/file
	git add subdir/file
	test_tick
	git commit -q -m dir
	echo subdir2 >> file
	test_tick
	git commit -q -a -m subdir2
fi
git rm -q extra
test_tick
git commit -q -m remove
git checkout -q -b devel
echo dev >> file
test_tick
git commit -q -a -m dev
git checkout -q master
echo bump >> file
test_tick
git commit -q -a -m bump
git tag v0.2
git checkout -q -b test master^
echo "== log =="
git log --oneline --graph --decorate --all
echo "== files =="
git ls-files --with-tree master
echo "== spaces =="
git show --quiet --format='%B' :/with
echo "== dot =="
git show --quiet --format='%B' :/dot
)
echo "== orig ref =="
git --git-dir=test/.git symbolic-ref HEAD
git --git-dir=test/.git symbolic-ref HEAD refs/heads/test
script="/home/felipec/Downloads/gitpacker.txt"
echo
echo "== script =="
sha1sum $script
$PYTHON_PATH $script -x -i test -o test-unpacked-1
$PYTHON_PATH $script -c -i test-unpacked-1 -o test-new-1
(
cd test-new-1
echo "== log =="
git log --oneline --graph --decorate --all
echo "== files =="
git ls-files --with-tree master
echo "== spaces =="
git show --quiet --format='%B' :/with
echo "== dot =="
git show --quiet --format='%B' :/dot
)
echo "== orig ref =="
git --git-dir=test/.git symbolic-ref HEAD
git --git-dir=test/.git symbolic-ref HEAD refs/heads/test
script="$HOME/bin/git-weave"
echo
echo "== script =="
sha1sum $script
$script -x -i test -o test-unpacked-2
$script -c -i test-unpacked-2 -o test-new-2
(
cd test-new-2
echo "== log =="
git log --oneline --graph --decorate --all
echo "== files =="
git ls-files --with-tree master
echo "== spaces =="
git show --quiet --format='%B' :/with
echo "== dot =="
git show --quiet --format='%B' :/dot
)
echo "== orig ref =="
git --git-dir=test/.git symbolic-ref HEAD
git --git-dir=test/.git symbolic-ref HEAD refs/heads/test