* Re: Re: [PATCH] Add git-unbundle - unpack objects and references for disconnected transfer
@ 2007-02-16 12:53 Mark Levedahl
2007-02-16 19:57 ` Junio C Hamano
2007-02-16 21:58 ` [PATCH] Add git-unbundle - unpack objects and references for disconnected transfer Mark Levedahl
0 siblings, 2 replies; 42+ messages in thread
From: Mark Levedahl @ 2007-02-16 12:53 UTC (permalink / raw)
To: Junio C Hamano, Mark Levedahl; +Cc: git
>>
>Also as Shawn pointed out, the script too heavily depends on GNU
>tar. Can we do something about it?
Let me ponder, I'm sure I can do something. As I noted in another response, Cygwin won't let me reliably pipe the pack file through bash, so I'm forced to use some archiver.
Mark
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-unbundle - unpack objects and references for disconnected transfer 2007-02-16 12:53 Re: [PATCH] Add git-unbundle - unpack objects and references for disconnected transfer Mark Levedahl @ 2007-02-16 19:57 ` Junio C Hamano 2007-02-16 23:21 ` Mark Levedahl 2007-02-17 14:50 ` Mark Levedahl 2007-02-16 21:58 ` [PATCH] Add git-unbundle - unpack objects and references for disconnected transfer Mark Levedahl 1 sibling, 2 replies; 42+ messages in thread From: Junio C Hamano @ 2007-02-16 19:57 UTC (permalink / raw) To: Mark Levedahl; +Cc: git Mark Levedahl <mdl123@verizon.net> writes: >>> >>Also as Shawn pointed out, the script too heavily depends on GNU >>tar. Can we do something about it? > > Let me ponder, I'm sure I can do something. As I noted in > another response, Cygwin won't let me reliably pipe the pack > file through bash, so I'm forced to use some archiver. Mark, how urgent do you want to have "bundle" in my tree? As I understand it, this came out of your working zip based implementation your group already use, so I am suspecting that you do not have urgent need to have a different one in git.git in a half-baked shape. The reason I say this is because I very much in favor of the general idea of an archive file that can be used to sneakernet repository updates, but I find the current bundle-unbundle interface a bit awkward to integrate into the rest of the system. Wouldn't it be nice if you can treat a bundle as just a different kind of git URL that you can "git fetch"? $ git fetch file.bdl 'refs/heads/*:refs/heads/*' As the way I read the current implementation of git-unbundle is that it is designed to only overwrite (with fast-forward check not to lose changes from the target) all the refs in place. It does not, for example, allow extracting only one branch, nor storing the branch in a tracking branch by renaming. I think it would be nicer to change the external interface from the bundle subsystem to the calling scripts, so that it behaves closer to git-fetch-pack. I haven't thought things through, but I think something like this would be sufficient: # create a bundle. $ git bundle --create v1.5.0..maint master next >file.bdl # you can view the tips of refs it contains. This is # similar to "ls-remote" output. $ git bundle --list-heads file.bdl efa13f7b7ea1605deab3a6478fa0d0706c828170 refs/heads/maint af99711cd84c30a16450f73dbc21ba9f9f9803e6 refs/heads/master 664e83a22b604fc5af1a84ddd48549b005cf4bc9 refs/heads/next # optionally allow checking if the pre-requisites of the # bundle are available. $ git bundle --verify file.bdl error: file.bdl depends on the following commits you lack in the repository: 82bf92f9846326a743102e27fa9827422dddfada v1.5.0-564-g82bf92f 4d462883de41190afd23672c8236361c61a9e6bd v1.5.0-34-g4d46288 # extract the packfile part, run index-pack and report # the result the same way as git-fetch-pack $ git bundle --unbundle file.bdl pack 340e3faa26616ca8c38b369a6d323ecf7a34f4fb efa13f7b7ea1605deab3a6478fa0d0706c828170 refs/heads/maint af99711cd84c30a16450f73dbc21ba9f9f9803e6 refs/heads/master 664e83a22b604fc5af1a84ddd48549b005cf4bc9 refs/heads/next Then git-ls-remote can be taught about a bundle file and use the 'git bundle --list-heads'. Also, with something like this in your config: [remote "bundle"] url = /home/me/tmp/file.bdl fetch = refs/heads/*:refs/remotes/origin/* You can first sneakernet the bundle file to ~/tmp/file.bdl and then these commands: $ git ls-remote bundle $ git fetch bundle $ git pull bundle would treat it as if it is talking with a remote side over the network. Hmm? ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-unbundle - unpack objects and references for disconnected transfer 2007-02-16 19:57 ` Junio C Hamano @ 2007-02-16 23:21 ` Mark Levedahl 2007-02-17 6:57 ` Junio C Hamano 2007-02-17 14:50 ` Mark Levedahl 1 sibling, 1 reply; 42+ messages in thread From: Mark Levedahl @ 2007-02-16 23:21 UTC (permalink / raw) To: Junio C Hamano; +Cc: git Junio C Hamano wrote: > Mark, how urgent do you want to have "bundle" in my tree? As I > understand it, this came out of your working zip based > implementation your group already use, so I am suspecting that > you do not have urgent need to have a different one in git.git > in a half-baked shape. > I can get my work done today with what I have. However, I really believe bundle is a good addition to git, and I don't want to maintain an out-of-tree patch to git to add this capability. > Wouldn't it be nice if you can treat a bundle as just a > different kind of git URL that you can "git fetch"? > > $ git fetch file.bdl 'refs/heads/*:refs/heads/*' > yes > > Then git-ls-remote can be taught about a bundle file and use the > 'git bundle --list-heads'. Also, with something like this in > your config: > > [remote "bundle"] > url = /home/me/tmp/file.bdl > fetch = refs/heads/*:refs/remotes/origin/* > > You can first sneakernet the bundle file to ~/tmp/file.bdl and > then these commands: > > $ git ls-remote bundle > $ git fetch bundle > $ git pull bundle > > would treat it as if it is talking with a remote side over the > network. > > Hmm? > > As long as I can still do a "git fetch file.bdl" and without having to do the config stuff. I'm happy. Integrating this bundle with basic git approaches to things is obviously good. The frontend you outlined seems a trivial rearrangement of what I already have, but I'll let this discussion progress a bit further before I start doing that. Mark ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-unbundle - unpack objects and references for disconnected transfer 2007-02-16 23:21 ` Mark Levedahl @ 2007-02-17 6:57 ` Junio C Hamano 2007-02-17 14:40 ` Mark Levedahl 0 siblings, 1 reply; 42+ messages in thread From: Junio C Hamano @ 2007-02-17 6:57 UTC (permalink / raw) To: Mark Levedahl; +Cc: git Mark Levedahl <mdl123@verizon.net> writes: >> Wouldn't it be nice if you can treat a bundle as just a >> different kind of git URL that you can "git fetch"? >> >> $ git fetch file.bdl 'refs/heads/*:refs/heads/*' >> > yes >... > > As long as I can still do a "git fetch file.bdl" and without having to > do the config stuff. I'm happy. I do not think your patch (original or respun) checks if it is overwriting the current branch. Even if it is a fast forward, it should check this condition and prevent the end user from getting confused. The above sample command line you quoted from my message can potentially have the same problem, but "git fetch" checks and refuses. Even when the "remote" is not a bundle file, it sometimes is useful to store the refs as they are without mapping them to remotes hierarchy as the separate-remote layout does. An obvious use case is to mirror another repository as is. So it may make sense to teach --mirror to "git-fetch" so that you can say: $ git fetch --mirror file.bdl which would be a short-hand to say: $ git fetch file.bdl 'refs/*:refs/*' A final note. A real 'mirror' mode should also remove stale refs that do not exist on the remote side anymore, which is a different use case as your bundle, which presumably is primarily meant to carry not all but only selected set of refs, and most likely not the 'master' branch head (and I am guessing that that is why you forgot to make sure you are not overwriting the current branch in the unbundle script). A real 'mirror' mode would use a separate option to remove a ref that does not exist on the remote end anymore, like: $ git fetch --mirror-all git://git.kernel.org/pub/scm/git/git.git/ ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-unbundle - unpack objects and references for disconnected transfer 2007-02-17 6:57 ` Junio C Hamano @ 2007-02-17 14:40 ` Mark Levedahl 2007-02-17 17:53 ` Junio C Hamano 0 siblings, 1 reply; 42+ messages in thread From: Mark Levedahl @ 2007-02-17 14:40 UTC (permalink / raw) To: Junio C Hamano; +Cc: git Junio C Hamano wrote: > I do not think your patch (original or respun) checks if it is > overwriting the current branch. Even if it is a fast forward, > it should check this condition and prevent the end user from > getting confused. The above sample command line you quoted from > my message can potentially have the same problem, but "git > fetch" checks and refuses. > True. I've been using this to mirror remote/* and tags. Making this just an alternate transport behind fetch and pull is clearly the cleanest way to deal with non-remote branches. > A final note. A real 'mirror' mode should also remove stale > refs that do not exist on the remote side anymore, which is a > different use case as your bundle, which presumably is primarily > meant to carry not all but only selected set of refs, and most > likely not the 'master' branch head (and I am guessing that that > is why you forgot to make sure you are not overwriting the > current branch in the unbundle script). A real 'mirror' mode > would use a separate option to remove a ref that does not exist > on the remote end anymore, like: > > $ git fetch --mirror-all git://git.kernel.org/pub/scm/git/git.git > Perhaps "git fetch --mirror --delete" would be more suggestive of the difference to "git fetch --mirror"? Mark ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-unbundle - unpack objects and references for disconnected transfer 2007-02-17 14:40 ` Mark Levedahl @ 2007-02-17 17:53 ` Junio C Hamano 0 siblings, 0 replies; 42+ messages in thread From: Junio C Hamano @ 2007-02-17 17:53 UTC (permalink / raw) To: Mark Levedahl; +Cc: git Mark Levedahl <mdl123@verizon.net> writes: > Junio C Hamano wrote: > ... >> A final note. A real 'mirror' mode should also remove stale >> refs that do not exist on the remote side anymore, which is a >> different use case as your bundle,... >> ... A real 'mirror' mode >> would use a separate option to remove a ref that does not exist >> on the remote end anymore, like: >> >> $ git fetch --mirror-all git://git.kernel.org/pub/scm/git/git.git >> > Perhaps "git fetch --mirror --delete" would be more suggestive of the > difference to "git fetch --mirror"? I think that is a good suggestion, as people familiar with rsync already know what --delete means in such a context. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-unbundle - unpack objects and references for disconnected transfer 2007-02-16 19:57 ` Junio C Hamano 2007-02-16 23:21 ` Mark Levedahl @ 2007-02-17 14:50 ` Mark Levedahl 2007-02-17 18:41 ` Junio C Hamano 1 sibling, 1 reply; 42+ messages in thread From: Mark Levedahl @ 2007-02-17 14:50 UTC (permalink / raw) To: Junio C Hamano; +Cc: git Junio C Hamano wrote: > I haven't thought things through, but I think something like this > would be sufficient: > > # create a bundle. > $ git bundle --create v1.5.0..maint master next >file.bdl > > In this vein, would it make sense to let git-push be the front end to create the bundle? I'm not sure git-push really contributes anything, but the interface would then be consistent across all transports. Just a thought. Mark ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-unbundle - unpack objects and references for disconnected transfer 2007-02-17 14:50 ` Mark Levedahl @ 2007-02-17 18:41 ` Junio C Hamano 2007-02-18 17:27 ` [PATCH] Add git-bundle: move objects and references by archive Mark Levedahl 0 siblings, 1 reply; 42+ messages in thread From: Junio C Hamano @ 2007-02-17 18:41 UTC (permalink / raw) To: Mark Levedahl; +Cc: git Mark Levedahl <mdl123@verizon.net> writes: > Junio C Hamano wrote: >> I haven't thought things through, but I think something like this >> would be sufficient: >> >> # create a bundle. >> $ git bundle --create v1.5.0..maint master next >file.bdl >> >> > In this vein, would it make sense to let git-push be the front end to > create the bundle? I'm not sure git-push really contributes anything, > but the interface would then be consistent across all transports. Just > a thought. I am not sure. For the "fetch" case, the UI will be exactly the same (URL and then refspecs), but we never say bottom commits for git-push. When you are creating a bundle, unless you are including everything, you want a way to say which bottom commits to be excluded from the resulting pack, so they are different. I had an impression bundle creation is similar to archive. ^ permalink raw reply [flat|nested] 42+ messages in thread
* [PATCH] Add git-bundle: move objects and references by archive. 2007-02-17 18:41 ` Junio C Hamano @ 2007-02-18 17:27 ` Mark Levedahl 2007-02-18 22:47 ` Mark Levedahl 0 siblings, 1 reply; 42+ messages in thread From: Mark Levedahl @ 2007-02-18 17:27 UTC (permalink / raw) To: junkio, mdl123; +Cc: git, Mark Levedahl Some workflows require use of repositories on machines that cannot be connected, preventing use of git-fetch / git-push to transport objects and references between the repositories. git-bundle provides an alternate transport mechanism, effectively allowing git-fetch and git-pull to operate using sneakernet transport. git-bundle --create allows the user to create a bundle containing one or more branches or tags, but with specified basis assumed to exist on the target repository. At the receiving end, git-bundle acts like git-fetch-pack, allowing the user to invoke git-fetch or git-pull using the bundle file as the URL. git-fetch and git-ls-remote determine they have a bundle URL by checking that the URL points to a file, but are otherwise unchanged in operation with bundles. Signed-off-by: Mark Levedahl <mdl123@verizon.net> --- Documentation/cmd-list.perl | 1 + Documentation/git-bundle.txt | 146 +++++++++++++++++++++++++++++++++++ Makefile | 3 +- git-bundle.sh | 174 ++++++++++++++++++++++++++++++++++++++++++ git-fetch.sh | 11 ++- git-ls-remote.sh | 7 ++- 6 files changed, 338 insertions(+), 4 deletions(-) mode change 100755 => 100644 Documentation/cmd-list.perl create mode 100644 Documentation/git-bundle.txt create mode 100755 git-bundle.sh diff --git a/Documentation/cmd-list.perl b/Documentation/cmd-list.perl old mode 100755 new mode 100644 index a2d6268..f61c77a --- a/Documentation/cmd-list.perl +++ b/Documentation/cmd-list.perl @@ -70,6 +70,7 @@ git-archive mainporcelain git-bisect mainporcelain git-blame ancillaryinterrogators git-branch mainporcelain +git-bundle mainporcelain git-cat-file plumbinginterrogators git-checkout-index plumbingmanipulators git-checkout mainporcelain diff --git a/Documentation/git-bundle.txt b/Documentation/git-bundle.txt new file mode 100644 index 0000000..27db785 --- /dev/null +++ b/Documentation/git-bundle.txt @@ -0,0 +1,146 @@ +git-bundle(1) +============= + +NAME +---- +git-bundle - Move objects and refs by archive + + +SYNOPSIS +-------- +'git-bundle' --create file <git-rev-list args> <--tar tarspec> +'git-bundle' --verify file <--tar tarspec> +'git-bundle' --list-heads file <reflist> <--tar tarspec> +'git-bundle' --unbundle file <reflist> <--tar tarspec> + +DESCRIPTION +----------- + +Some workflows require that one or more branches of development on one +machine be replicated on another machine, but the two machines cannot +be directly connected so the interactive git protocols (git, ssh, +rsync, http) cannot be used. This command provides suport for +git-fetch and git-pull to operate by packaging objects and references +in an archive at the originating machine, then importing those into +another repository using gitlink:git-fetch[1] and gitlink:git-pull[1] +after moving the archive by some means (i.e., by sneakernet). As no +direct connection between repositories exists, the user must specify a +basis for the bundle that is held by the destination repository: the +bundle assumes that all objects in the basis are already in the +destination repository. + +OPTIONS +------- + +--create file:: + Used to create a bundle named 'file'. This requires the + git-rev-list arguments to define the bundle contents. + +--verify file:: + Used to check that a bundle file is valid and will apply + cleanly to the current repository. This includes checks on the + bundle format itself as well as checking that the prerequisite + commits exist and are fully linked in the current repository. + git-bundle prints a list of missing commits, if any, and exits + with non-zero status. + +--list-heads file:: + Lists the references defined in the bundle. If followed by a + list of references, only references matching those given are + printed out. + +--unbundle file:: + Passes the objects in the bundle to gitlink:git-index-pack[1] + for storage in the repository, then prints the names of all + defined references. If a reflist is given, only references + matching those in the given list are printed. This command is + really plumbing, intended to be called only by + gitlink:git-fetch[1]. + +git-rev-list args:: + A list of arguments, accepatble to git-rev-parse and + git-rev-list, that specify the specific objects and references + to transport. For example, "master~10..master" causes the + current master reference to be packaged along with all objects + added since its 10th ancestor commit. There is no explicit + limit to the number of references and objects that may be + packaged. + + +reflist:: + A list of references used to limit the references reported as + available. This is principally of use to git-fetch, which + expects to recieve only those references asked for and not + necessarily everything in the pack (in this case, git-bundle is + acting like gitlink:git-fetch-pack[1]). + +tar tarspec:: + + git-bundle uses tar, and requires a tar supporting c, f, and + -O. By default, git-bundle looks for gtar on the path, then for + tar if gtar is not found. This can be overridden by explicitly + defining tarspec, or by defining TAR in the environment. + +SPECIFYING REFERENCES +--------------------- + +git-bundle will only package references that are shown by +git-show-ref: this includes heads, tags, and remote heads. References +such as master~1 cannot be packaged, but are perfectly suitable for +defining the basis. More than one reference may be packaged, and more +than one basis can be specified. The objects packaged are those not +contained in the union of the given bases. Each basis can be +specified explicitly (e.g., ^master~10), or implicitly (e.g., +master~10..master, master --since=10.days.ago). + +It is very important that the basis used be held by the destination. +It is ok to err on the side of conservatism, causing the bundle file +to contain objects already in the destination as these are ignored +when unpacking at the destination. + +EXAMPLE +------- + +Assume two repositories exist as R1 on machine A, and R2 on machine B. +For whatever reason, direct connection between A and B is not allowed, +but we can move data from A to B via some mechanism (CD, email, etc). +We want to update R2 with developments made on branch master in R1. +We set a tag in R1 (lastR2bundle) after the previous such transport, +and move it afterwards to help build the bundle. + +in R1 on A: +$ git-bundle --create mybundle master ^lastR2bundle +$ git tag -f lastR2bundle master + +(move mybundle from A to B by some mechanism) + +in R2 on B: +$ git-bundle --verify mybundle +$ git-fetch mybundle refspec + +where refspec is refInBundle:localRef + + +Also, with something like this in your config: + +[remote "bundle"] + url = /home/me/tmp/file.bdl + fetch = refs/heads/*:refs/remotes/origin/* + +You can first sneakernet the bundle file to ~/tmp/file.bdl and +then these commands: + +$ git ls-remote bundle +$ git fetch bundle +$ git pull bundle + +would treat it as if it is talking with a remote side over the +network. + +Author +------ +Written by Mark Levedahl <mdl123@verizon.net> + +GIT +--- +Part of the gitlink:git[7] suite diff --git a/Makefile b/Makefile index ebecbbd..c6d540e 100644 --- a/Makefile +++ b/Makefile @@ -177,7 +177,8 @@ SCRIPT_SH = \ git-applymbox.sh git-applypatch.sh git-am.sh \ git-merge.sh git-merge-stupid.sh git-merge-octopus.sh \ git-merge-resolve.sh git-merge-ours.sh \ - git-lost-found.sh git-quiltimport.sh + git-lost-found.sh git-quiltimport.sh \ + git-bundle.sh SCRIPT_PERL = \ git-add--interactive.perl \ diff --git a/git-bundle.sh b/git-bundle.sh new file mode 100755 index 0000000..2bb9232 --- /dev/null +++ b/git-bundle.sh @@ -0,0 +1,174 @@ +#!/bin/sh +# Basic handler for bundle files to connect repositories via sneakernet. +# Invocation must include action. +# This function can create a bundle or provide information on an existing bundle +# supporting git-fetch, git-pull, and git-ls-remote + +USAGE='[--create bundle <git-rev-list args>] | +[--verify|--list-heads|--unbundle bundle] <--tar tarspec> + where bundle is the name of the bundle file.' + +verify() { + # Check bundle version + test -r "$bfile" || die "cannot find $bfile" + test "$($TAR -xf ""$bfile"" -O version)" = "v1 git-bundle" || + die "$bfile doesn't look like a v1 bundle file." + + # do fast check, then if any prereqs are missing then go line by line + # to be verbose about the errors + prereqs=$($TAR xf "$bfile" -O prerequisites) + test -z "$prereqs" && return 0 + bad=$(echo "$prereqs" | cut -b-40 | git-rev-list --stdin --not --all 2>&1) + if test -n "$bad" ; then + test "$1" = "--silent" && return 1 + echo "error: $bfile requires the following commits you lack:" + echo "$prereqs" | + while read sha1 comment ; do + missing=$(git-rev-list $sha1 --not --all 2>&1) + test -n "$missing" && echo "$sha1 $comment" + done + exit 1 + fi + return 0 +} + +# list all or just a subset +list_heads() { + if test -z "$*" ; then + $TAR -xf "$bfile" -O references 2>/dev/null || exit 1 + else + ($TAR -xf "$bfile" -O references 2>/dev/null || exit 1) | + while read sha1 ref ; do + for arg in $* ; do + if test "${ref%$arg}" != "$ref" ; then + echo "$sha1 $ref" + break + fi + done + done + fi +} + +SUBDIRECTORY_OK=1 +. git-sh-setup + +args= +action= +while test -n "$1" ; do + case $1 in + --create|--list-heads|--unbundle|--verify) + action=${1#--} + shift + bfile=$1 + test -z "$bfile" && die "$action requires filename";; + --tar=*) + TAR=${1##--tar=};; + --tar) + shift + TAR=$1;; + *) + args="$args $1";; + esac + shift +done +test -z "$action" && die "No action given, what should I do?" + +# what tar to use, prefer gtar, then tar. +if test -z "$TAR" ; then + GTAR=$(which gtar 2>/dev/null) + TAR=${GTAR:-tar} +fi + +case $action in +create) + unknown=$(git-rev-parse --no-revs $args) + test -z "$unknown" || die "unknown option: $unknown" + gitrevargs=$(git-rev-parse --symbolic --revs-only $args) || exit 1 + + # find the refs to carry along and get sha1s for each. + refs= + fullrevargs= + for arg in $gitrevargs ; do + #ignore options and basis refs, get full ref name for things + # we will transport rejecting anything ambiguous (e.g., user + # gives master, have heads/master and remotes/origin/master, we + # keep the former). + case "$arg" in + -* | ^*) fullrevargs="$fullrevargs $arg";; + *) ref=$(git-show-ref "$arg") + test "$(echo $ref | wc -w)" = "2" || die "Ambigous reference: $arg +$ref" + fullrevargs="$fullrevargs ${ref#* }" + refs="$refs $ref";; + esac + done + test -z "$refs" && die "No references specified, I don't know what to bundle." + + # git-rev-list cannot determine edge objects if a date restriction is + # given... we do things a slow way if max-age or min-age are given + case "$fullrevargs" in + *--max-age* | *--min-age*) + # get a list of all commits that will be packed along with + # parents of each. A fixed git-rev-list --boundary should + # replace all of this. + echo "Finding prerequisites and commits to bundle..." + commits=$(git-rev-list $fullrevargs) + + # get immediate parents of each commit to include + parents= + for c in $commits ; do + parents="$parents $(git-rev-list --parents --max-count=1 $c | cut -b42-)" + done + parents=$(printf "%s\n" $parents | sort | uniq) + + # factor out what will be in this bundle, the remainder are the + # bundle's prerequisites. double up commits in this as we only + # want things that are only in parents to appear once + prereqs=$(printf "%s\n" $parents $commits $commits | \ + sort | uniq -c | sed -ne 's/^ *1 //p');; + *) + prereqs=$(git-rev-list --objects-edge $fullrevargs | sed -ne 's/^-//p');; + esac + + # replace prereqs with annotated version of same + + # create refs and pack + tmp="$GIT_DIR/bundle_tmp$$" + prerequisites="$tmp/prerequisites" + references="$tmp/references" + version="$tmp/version" + pack="$tmp/pack" + trap 'rm -rf "$tmp"' 0 1 2 3 15 + + mkdir "$tmp" && + echo "v1 git-bundle" > "$version" && + touch "$prerequisites" && + (for p in $prereqs ; do + git-rev-list --pretty=one --max-count=1 $p + done) > "$prerequisites" && + git-show-ref $refs > "$references" && + git-rev-list --objects $fullrevargs | cut -b-40 | + git pack-objects --all-progress --stdout > "$pack" && + + # create the tar file, clean up + cd "$tmp" && + tar cf bundle prerequisites references version pack && + cd - && + mv "$tmp/bundle" "$bfile" && + rm -rf "$tmp" + + # done + echo "Created $bfile";; + +verify) + verify && echo "$bfile is ok";; + +list-heads) + list_heads $args;; + +unbundle) + verify --silent || exit 1 + $TAR -xf "$bfile" -O pack | git-index-pack --stdin || + die "error: $bfile has a corrupted pack file" + list_heads $args;; +esac diff --git a/git-fetch.sh b/git-fetch.sh index ca984e7..4cb4009 100755 --- a/git-fetch.sh +++ b/git-fetch.sh @@ -377,8 +377,15 @@ fetch_main () { ( : subshell because we muck with IFS IFS=" $LF" ( - git-fetch-pack --thin $exec $keep $shallow_depth "$remote" $rref || - echo failed "$remote" + if test -r "$remote" ; then + test -n "$shallow_depth" && + die "shallow clone with bundle is not supported" + git-bundle --unbundle "$remote" $rref || + echo failed "$remote" + else + git-fetch-pack --thin $exec $keep $shallow_depth "$remote" $rref || + echo failed "$remote" + fi ) | ( trap ' diff --git a/git-ls-remote.sh b/git-ls-remote.sh index 8ea5c5e..73b745d 100755 --- a/git-ls-remote.sh +++ b/git-ls-remote.sh @@ -89,8 +89,13 @@ rsync://* ) ;; * ) - git-peek-remote $exec "$peek_repo" || + if test -r "$peek_repo" ; then + git bundle --list-heads "$peek_repo" || echo "failed slurping" + else + git-peek-remote $exec "$peek_repo" || + echo "failed slurping" + fi ;; esac | sort -t ' ' -k 2 | -- 1.5.0.rc4.375.gd0938-dirty ^ permalink raw reply related [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-bundle: move objects and references by archive. 2007-02-18 17:27 ` [PATCH] Add git-bundle: move objects and references by archive Mark Levedahl @ 2007-02-18 22:47 ` Mark Levedahl 0 siblings, 0 replies; 42+ messages in thread From: Mark Levedahl @ 2007-02-18 22:47 UTC (permalink / raw) To: junkio; +Cc: git Mark Levedahl wrote: > - git-peek-remote $exec "$peek_repo" || > + if test -r "$peek_repo" ; then > oops: test -f actually works. Corrected patch follows. Mark ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-unbundle - unpack objects and references for disconnected transfer 2007-02-16 12:53 Re: [PATCH] Add git-unbundle - unpack objects and references for disconnected transfer Mark Levedahl 2007-02-16 19:57 ` Junio C Hamano @ 2007-02-16 21:58 ` Mark Levedahl 2007-02-16 22:51 ` Johannes Schindelin 1 sibling, 1 reply; 42+ messages in thread From: Mark Levedahl @ 2007-02-16 21:58 UTC (permalink / raw) To: Mark Levedahl; +Cc: Junio C Hamano, git Mark Levedahl wrote: >> Also as Shawn pointed out, the script too heavily depends on GNU >> tar. Can we do something about it >> This one is easy, make tmp a directory, then build the tar file in that directory so the archive members don't include the tmp name, then just move the tar file to where it is needed... tmp="$GIT_DIR/bundle_tmp$$" references="$tmp/references" pack="$tmp/pack" trap 'rm -rf "$tmp"' 0 1 2 3 15 mkdir "$tmp" && echo "v1 git-bundle" > "$version" && ------ # create the tar file, clean up cd "$tmp" && tar cf bundle prerequisites references version pack && cd - && mv "$tmp/bundle" "$bfile" && rm -rf "$tmp" I believe that as this works, tar needs only to understand c, f, and -O. It is easy enough to search for gtar and use that, or allow user to define TAR. So, I think this is a non-issue. Mark ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-unbundle - unpack objects and references for disconnected transfer 2007-02-16 21:58 ` [PATCH] Add git-unbundle - unpack objects and references for disconnected transfer Mark Levedahl @ 2007-02-16 22:51 ` Johannes Schindelin 0 siblings, 0 replies; 42+ messages in thread From: Johannes Schindelin @ 2007-02-16 22:51 UTC (permalink / raw) To: Mark Levedahl; +Cc: Junio C Hamano, git Hi, On Fri, 16 Feb 2007, Mark Levedahl wrote: > Mark Levedahl wrote: > > > Also as Shawn pointed out, the script too heavily depends on GNU > > > tar. Can we do something about it > > > > This one is easy, make tmp a directory, then build the tar file in that > directory so the archive members don't include the tmp name, then just move > the tar file to where it is needed... Why not just get rid of this unneeded dependency? It is _unneeded_. Maybe you never suffered dependency hell (this is the only way I can explain your resistance). It is _good_ to get rid of dependencies. Ah whatever, if you don't just get rid of that tar dependency, I'll just do it. Ciao, Dscho ^ permalink raw reply [flat|nested] 42+ messages in thread
* [PATCH] Add git-bundle: move objects and references by archive. @ 2007-02-18 22:47 Mark Levedahl 2007-02-19 1:07 ` Johannes Schindelin 2007-02-19 1:49 ` Junio C Hamano 0 siblings, 2 replies; 42+ messages in thread From: Mark Levedahl @ 2007-02-18 22:47 UTC (permalink / raw) To: junkio; +Cc: git, Mark Levedahl Some workflows require use of repositories on machines that cannot be connected, preventing use of git-fetch / git-push to transport objects and references between the repositories. git-bundle provides an alternate transport mechanism, effectively allowing git-fetch and git-pull to operate using sneakernet transport. git-bundle --create allows the user to create a bundle containing one or more branches or tags, but with specified basis assumed to exist on the target repository. At the receiving end, git-bundle acts like git-fetch-pack, allowing the user to invoke git-fetch or git-pull using the bundle file as the URL. git-fetch and git-ls-remote determine they have a bundle URL by checking that the URL points to a file, but are otherwise unchanged in operation with bundles. Signed-off-by: Mark Levedahl <mdl123@verizon.net> --- Documentation/cmd-list.perl | 1 + Documentation/git-bundle.txt | 146 +++++++++++++++++++++++++++++++++++ Makefile | 3 +- git-bundle.sh | 174 ++++++++++++++++++++++++++++++++++++++++++ git-fetch.sh | 11 ++- git-ls-remote.sh | 7 ++- 6 files changed, 338 insertions(+), 4 deletions(-) mode change 100755 => 100644 Documentation/cmd-list.perl create mode 100644 Documentation/git-bundle.txt create mode 100755 git-bundle.sh diff --git a/Documentation/cmd-list.perl b/Documentation/cmd-list.perl old mode 100755 new mode 100644 index a2d6268..f61c77a --- a/Documentation/cmd-list.perl +++ b/Documentation/cmd-list.perl @@ -70,6 +70,7 @@ git-archive mainporcelain git-bisect mainporcelain git-blame ancillaryinterrogators git-branch mainporcelain +git-bundle mainporcelain git-cat-file plumbinginterrogators git-checkout-index plumbingmanipulators git-checkout mainporcelain diff --git a/Documentation/git-bundle.txt b/Documentation/git-bundle.txt new file mode 100644 index 0000000..27db785 --- /dev/null +++ b/Documentation/git-bundle.txt @@ -0,0 +1,146 @@ +git-bundle(1) +============= + +NAME +---- +git-bundle - Move objects and refs by archive + + +SYNOPSIS +-------- +'git-bundle' --create file <git-rev-list args> <--tar tarspec> +'git-bundle' --verify file <--tar tarspec> +'git-bundle' --list-heads file <reflist> <--tar tarspec> +'git-bundle' --unbundle file <reflist> <--tar tarspec> + +DESCRIPTION +----------- + +Some workflows require that one or more branches of development on one +machine be replicated on another machine, but the two machines cannot +be directly connected so the interactive git protocols (git, ssh, +rsync, http) cannot be used. This command provides suport for +git-fetch and git-pull to operate by packaging objects and references +in an archive at the originating machine, then importing those into +another repository using gitlink:git-fetch[1] and gitlink:git-pull[1] +after moving the archive by some means (i.e., by sneakernet). As no +direct connection between repositories exists, the user must specify a +basis for the bundle that is held by the destination repository: the +bundle assumes that all objects in the basis are already in the +destination repository. + +OPTIONS +------- + +--create file:: + Used to create a bundle named 'file'. This requires the + git-rev-list arguments to define the bundle contents. + +--verify file:: + Used to check that a bundle file is valid and will apply + cleanly to the current repository. This includes checks on the + bundle format itself as well as checking that the prerequisite + commits exist and are fully linked in the current repository. + git-bundle prints a list of missing commits, if any, and exits + with non-zero status. + +--list-heads file:: + Lists the references defined in the bundle. If followed by a + list of references, only references matching those given are + printed out. + +--unbundle file:: + Passes the objects in the bundle to gitlink:git-index-pack[1] + for storage in the repository, then prints the names of all + defined references. If a reflist is given, only references + matching those in the given list are printed. This command is + really plumbing, intended to be called only by + gitlink:git-fetch[1]. + +git-rev-list args:: + A list of arguments, accepatble to git-rev-parse and + git-rev-list, that specify the specific objects and references + to transport. For example, "master~10..master" causes the + current master reference to be packaged along with all objects + added since its 10th ancestor commit. There is no explicit + limit to the number of references and objects that may be + packaged. + + +reflist:: + A list of references used to limit the references reported as + available. This is principally of use to git-fetch, which + expects to recieve only those references asked for and not + necessarily everything in the pack (in this case, git-bundle is + acting like gitlink:git-fetch-pack[1]). + +tar tarspec:: + + git-bundle uses tar, and requires a tar supporting c, f, and + -O. By default, git-bundle looks for gtar on the path, then for + tar if gtar is not found. This can be overridden by explicitly + defining tarspec, or by defining TAR in the environment. + +SPECIFYING REFERENCES +--------------------- + +git-bundle will only package references that are shown by +git-show-ref: this includes heads, tags, and remote heads. References +such as master~1 cannot be packaged, but are perfectly suitable for +defining the basis. More than one reference may be packaged, and more +than one basis can be specified. The objects packaged are those not +contained in the union of the given bases. Each basis can be +specified explicitly (e.g., ^master~10), or implicitly (e.g., +master~10..master, master --since=10.days.ago). + +It is very important that the basis used be held by the destination. +It is ok to err on the side of conservatism, causing the bundle file +to contain objects already in the destination as these are ignored +when unpacking at the destination. + +EXAMPLE +------- + +Assume two repositories exist as R1 on machine A, and R2 on machine B. +For whatever reason, direct connection between A and B is not allowed, +but we can move data from A to B via some mechanism (CD, email, etc). +We want to update R2 with developments made on branch master in R1. +We set a tag in R1 (lastR2bundle) after the previous such transport, +and move it afterwards to help build the bundle. + +in R1 on A: +$ git-bundle --create mybundle master ^lastR2bundle +$ git tag -f lastR2bundle master + +(move mybundle from A to B by some mechanism) + +in R2 on B: +$ git-bundle --verify mybundle +$ git-fetch mybundle refspec + +where refspec is refInBundle:localRef + + +Also, with something like this in your config: + +[remote "bundle"] + url = /home/me/tmp/file.bdl + fetch = refs/heads/*:refs/remotes/origin/* + +You can first sneakernet the bundle file to ~/tmp/file.bdl and +then these commands: + +$ git ls-remote bundle +$ git fetch bundle +$ git pull bundle + +would treat it as if it is talking with a remote side over the +network. + +Author +------ +Written by Mark Levedahl <mdl123@verizon.net> + +GIT +--- +Part of the gitlink:git[7] suite diff --git a/Makefile b/Makefile index ebecbbd..c6d540e 100644 --- a/Makefile +++ b/Makefile @@ -177,7 +177,8 @@ SCRIPT_SH = \ git-applymbox.sh git-applypatch.sh git-am.sh \ git-merge.sh git-merge-stupid.sh git-merge-octopus.sh \ git-merge-resolve.sh git-merge-ours.sh \ - git-lost-found.sh git-quiltimport.sh + git-lost-found.sh git-quiltimport.sh \ + git-bundle.sh SCRIPT_PERL = \ git-add--interactive.perl \ diff --git a/git-bundle.sh b/git-bundle.sh new file mode 100755 index 0000000..19ac137 --- /dev/null +++ b/git-bundle.sh @@ -0,0 +1,174 @@ +#!/bin/sh +# Basic handler for bundle files to connect repositories via sneakernet. +# Invocation must include action. +# This function can create a bundle or provide information on an existing bundle +# supporting git-fetch, git-pull, and git-ls-remote + +USAGE='[--create bundle <git-rev-list args>] | +[--verify|--list-heads|--unbundle bundle] <--tar tarspec> + where bundle is the name of the bundle file.' + +verify() { + # Check bundle version + test -f "$bfile" || die "cannot find $bfile" + test "$($TAR -xf ""$bfile"" -O version)" = "v1 git-bundle" || + die "$bfile doesn't look like a v1 bundle file." + + # do fast check, then if any prereqs are missing then go line by line + # to be verbose about the errors + prereqs=$($TAR xf "$bfile" -O prerequisites) + test -z "$prereqs" && return 0 + bad=$(echo "$prereqs" | cut -b-40 | git-rev-list --stdin --not --all 2>&1) + if test -n "$bad" ; then + test "$1" = "--silent" && return 1 + echo "error: $bfile requires the following commits you lack:" + echo "$prereqs" | + while read sha1 comment ; do + missing=$(git-rev-list $sha1 --not --all 2>&1) + test -n "$missing" && echo "$sha1 $comment" + done + exit 1 + fi + return 0 +} + +# list all or just a subset +list_heads() { + if test -z "$*" ; then + $TAR -xf "$bfile" -O references 2>/dev/null || exit 1 + else + ($TAR -xf "$bfile" -O references 2>/dev/null || exit 1) | + while read sha1 ref ; do + for arg in $* ; do + if test "${ref%$arg}" != "$ref" ; then + echo "$sha1 $ref" + break + fi + done + done + fi +} + +SUBDIRECTORY_OK=1 +. git-sh-setup + +args= +action= +while test -n "$1" ; do + case $1 in + --create|--list-heads|--unbundle|--verify) + action=${1#--} + shift + bfile=$1 + test -z "$bfile" && die "$action requires filename";; + --tar=*) + TAR=${1##--tar=};; + --tar) + shift + TAR=$1;; + *) + args="$args $1";; + esac + shift +done +test -z "$action" && die "No action given, what should I do?" + +# what tar to use, prefer gtar, then tar. +if test -z "$TAR" ; then + GTAR=$(which gtar 2>/dev/null) + TAR=${GTAR:-tar} +fi + +case $action in +create) + unknown=$(git-rev-parse --no-revs $args) + test -z "$unknown" || die "unknown option: $unknown" + gitrevargs=$(git-rev-parse --symbolic --revs-only $args) || exit 1 + + # find the refs to carry along and get sha1s for each. + refs= + fullrevargs= + for arg in $gitrevargs ; do + #ignore options and basis refs, get full ref name for things + # we will transport rejecting anything ambiguous (e.g., user + # gives master, have heads/master and remotes/origin/master, we + # keep the former). + case "$arg" in + -* | ^*) fullrevargs="$fullrevargs $arg";; + *) ref=$(git-show-ref "$arg") + test "$(echo $ref | wc -w)" = "2" || die "Ambigous reference: $arg +$ref" + fullrevargs="$fullrevargs ${ref#* }" + refs="$refs $ref";; + esac + done + test -z "$refs" && die "No references specified, I don't know what to bundle." + + # git-rev-list cannot determine edge objects if a date restriction is + # given... we do things a slow way if max-age or min-age are given + case "$fullrevargs" in + *--max-age* | *--min-age*) + # get a list of all commits that will be packed along with + # parents of each. A fixed git-rev-list --boundary should + # replace all of this. + echo "Finding prerequisites and commits to bundle..." + commits=$(git-rev-list $fullrevargs) + + # get immediate parents of each commit to include + parents= + for c in $commits ; do + parents="$parents $(git-rev-list --parents --max-count=1 $c | cut -b42-)" + done + parents=$(printf "%s\n" $parents | sort | uniq) + + # factor out what will be in this bundle, the remainder are the + # bundle's prerequisites. double up commits in this as we only + # want things that are only in parents to appear once + prereqs=$(printf "%s\n" $parents $commits $commits | \ + sort | uniq -c | sed -ne 's/^ *1 //p');; + *) + prereqs=$(git-rev-list --objects-edge $fullrevargs | sed -ne 's/^-//p');; + esac + + # replace prereqs with annotated version of same + + # create refs and pack + tmp="$GIT_DIR/bundle_tmp$$" + prerequisites="$tmp/prerequisites" + references="$tmp/references" + version="$tmp/version" + pack="$tmp/pack" + trap 'rm -rf "$tmp"' 0 1 2 3 15 + + mkdir "$tmp" && + echo "v1 git-bundle" > "$version" && + touch "$prerequisites" && + (for p in $prereqs ; do + git-rev-list --pretty=one --max-count=1 $p + done) > "$prerequisites" && + git-show-ref $refs > "$references" && + git-rev-list --objects $fullrevargs | cut -b-40 | + git pack-objects --all-progress --stdout > "$pack" && + + # create the tar file, clean up + cd "$tmp" && + tar cf bundle prerequisites references version pack && + cd - && + mv "$tmp/bundle" "$bfile" && + rm -rf "$tmp" + + # done + echo "Created $bfile";; + +verify) + verify && echo "$bfile is ok";; + +list-heads) + list_heads $args;; + +unbundle) + verify --silent || exit 1 + $TAR -xf "$bfile" -O pack | git-index-pack --stdin || + die "error: $bfile has a corrupted pack file" + list_heads $args;; +esac diff --git a/git-fetch.sh b/git-fetch.sh index ca984e7..42cc62f 100755 --- a/git-fetch.sh +++ b/git-fetch.sh @@ -377,8 +377,15 @@ fetch_main () { ( : subshell because we muck with IFS IFS=" $LF" ( - git-fetch-pack --thin $exec $keep $shallow_depth "$remote" $rref || - echo failed "$remote" + if test -f "$remote" ; then + test -n "$shallow_depth" && + die "shallow clone with bundle is not supported" + git-bundle --unbundle "$remote" $rref || + echo failed "$remote" + else + git-fetch-pack --thin $exec $keep $shallow_depth "$remote" $rref || + echo failed "$remote" + fi ) | ( trap ' diff --git a/git-ls-remote.sh b/git-ls-remote.sh index 8ea5c5e..28bb9b8 100755 --- a/git-ls-remote.sh +++ b/git-ls-remote.sh @@ -89,8 +89,13 @@ rsync://* ) ;; * ) - git-peek-remote $exec "$peek_repo" || + if test -f "$peek_repo" ; then + git bundle --list-heads "$peek_repo" || echo "failed slurping" + else + git-peek-remote $exec "$peek_repo" || + echo "failed slurping" + fi ;; esac | sort -t ' ' -k 2 | -- 1.5.0.rc4.375.gd0938-dirty ^ permalink raw reply related [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-bundle: move objects and references by archive. 2007-02-18 22:47 [PATCH] Add git-bundle: move objects and references by archive Mark Levedahl @ 2007-02-19 1:07 ` Johannes Schindelin 2007-02-19 1:50 ` Junio C Hamano ` (2 more replies) 2007-02-19 1:49 ` Junio C Hamano 1 sibling, 3 replies; 42+ messages in thread From: Johannes Schindelin @ 2007-02-19 1:07 UTC (permalink / raw) To: Mark Levedahl; +Cc: git Hi, Sorry to be such a PITA, but I really, really think that it is wrong to make a tar dependency here. You said your cygwin has problems with binary files. Could you please try this: $ echo -ne '\x1a\x1b\x15\x10\0abc' | cat | wc If it returns anything else than "<tab>0<tab>1<tab>8", then your setup works differently from mine. I fit returns what I expect it to, then we can use cat directly and do not have to move the tar bloat around (it is inherently block based, so it wastes a lot of space, and it also stores other things we'll never use, like permissions for all files). On Sun, 18 Feb 2007, Mark Levedahl wrote: > diff --git a/Documentation/cmd-list.perl b/Documentation/cmd-list.perl > old mode 100755 > new mode 100644 This is unintended, right? > diff --git a/git-bundle.sh b/git-bundle.sh > new file mode 100755 > index 0000000..19ac137 > --- /dev/null > +++ b/git-bundle.sh > @@ -0,0 +1,174 @@ > +#!/bin/sh > +# Basic handler for bundle files to connect repositories via sneakernet. > +# Invocation must include action. > +# This function can create a bundle or provide information on an existing bundle > +# supporting git-fetch, git-pull, and git-ls-remote > + > +USAGE='[--create bundle <git-rev-list args>] | > +[--verify|--list-heads|--unbundle bundle] <--tar tarspec> > + where bundle is the name of the bundle file.' We seem to prefer another (more consistent?) way to describe things: If the argument is not an option, we embrace it with "<>", for example: <bundle> if it is an optional option, we put it in "[]", for example: [--tar <tarfile>] and if one of several mutual exclusive options has to be passed, we put it in "()" delimited with "|". Also, we put no explanation in the output of "-h", since you should read the man page if you don't know what the options mean. So, it is git-bundle ( --create <bundle> [<rev-list-args>...] | --verify | --list-heads | --unbundle <bundle> ) [--tar <tarspec>] > +verify() { This function is nicely done, AFAICT! > +list_heads() { > + if test -z "$*" ; then > + $TAR -xf "$bfile" -O references 2>/dev/null || exit 1 > + else > + ($TAR -xf "$bfile" -O references 2>/dev/null || exit 1) | AFAICT this will not work as expected: "()" opens a subshell, and "exit 1" will only exit that subshell. Correct me if I'm wrong. > + while read sha1 ref ; do > + for arg in $* ; do > + if test "${ref%$arg}" != "$ref" ; then > + echo "$sha1 $ref" > + break > + fi > + done > + done > + fi > +} The complexity here is unnecessarily high (O(N*M)), but I guess we cannot do better in shell. Maybe constructing a regexp from the args, and piping the output from tar through a grep would help here. Dunno. Maybe it doesn't matter much in reality, anyway. > +SUBDIRECTORY_OK=1 > +. git-sh-setup We tend to write this part at the very top of a git script. > +while test -n "$1" ; do > + case $1 in > + --create|--list-heads|--unbundle|--verify) > + action=${1#--} > + shift > + bfile=$1 This can contain spaces (you are working on Windows, right? Windows users _love_ spaces in their filenames). > + test -z "$bfile" && die "$action requires filename";; > + --tar=*) > + TAR=${1##--tar=};; > + --tar) > + shift > + TAR=$1;; > + *) > + args="$args $1";; I know you want to mix the --tar option with the rev-list options (which I still find totally confusing, and unnecessary at best), but you _have_ to quote "$1" in the args=... part, then. Even refs can contain spaces these days. > +# what tar to use, prefer gtar, then tar. > +if test -z "$TAR" ; then > + GTAR=$(which gtar 2>/dev/null) Somebody on this list said that "which" should not be relied upon IIRC. I forgot why, but I obey that hint. > + gitrevargs=$(git-rev-parse --symbolic --revs-only $args) || exit 1 Here, you rely again on the refs not containing spaces. > + # git-rev-list cannot determine edge objects if a date restriction is > + # given... we do things a slow way if max-age or min-age are given Might make sense to teach max-age about boundary commits instead... > + prereqs=$(git-rev-list --objects-edge $fullrevargs | sed -ne 's/^-//p');; Since you are not interested in non-commits at all, why not use "--boundary" instead? > + # create refs and pack > + tmp="$GIT_DIR/bundle_tmp$$" Should we really rely on $GIT_DIR being writable? For unbundling, yes, but for bundling? Given the great confusion with git-status trying to write into $GIT_DIR, and people _demanding_ that it does not do so, I recommend against that. In the following part, refnames must not contain spaces again: > + (for p in $prereqs ; do > + git-rev-list --pretty=one --max-count=1 $p > + done) > "$prerequisites" && > + git-show-ref $refs > "$references" && > + git-rev-list --objects $fullrevargs | cut -b-40 | > + git pack-objects --all-progress --stdout > "$pack" && pack-objects can take rev-list arguments itself these days, using "--revs" and being piped the rev-list arguments. > +verify) > + verify && echo "$bfile is ok";; > + > +list-heads) > + list_heads $args;; I like this abstraction (defining a function, and calling that). Why don't you do it with all commands? Rest looks fine. Given the problems you had with "cat" on Cygwin, and the quoting stuff, I think it might make sense to make this a builtin right away, before the script stage (which is meant to speed things up, not slow them down). Ciao, Dscho ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-bundle: move objects and references by archive. 2007-02-19 1:07 ` Johannes Schindelin @ 2007-02-19 1:50 ` Junio C Hamano 2007-02-19 2:02 ` Johannes Schindelin 2007-02-19 7:56 ` Shawn O. Pearce 2007-02-19 13:29 ` Mark Levedahl 2 siblings, 1 reply; 42+ messages in thread From: Junio C Hamano @ 2007-02-19 1:50 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Mark Levedahl, git Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > Sorry to be such a PITA, but I really, really think that it is wrong to > make a tar dependency here. You said your cygwin has problems with binary > files. Could you please try this: > > $ echo -ne '\x1a\x1b\x15\x10\0abc' | cat | wc It might be just me, but "echo -e" makes me feel much more uneasy than explicitly saying "Sorry, we require GNU tar in this little corner of the system". >> + bfile=$1 > > This can contain spaces (you are working on Windows, right? Windows users > _love_ spaces in their filenames). Which is fine. Try var1=$var2 with something with SP or TAB in var2 and later try running $ echo "<$var1>" to see what happens. >> + gitrevargs=$(git-rev-parse --symbolic --revs-only $args) || exit 1 > > Here, you rely again on the refs not containing spaces. Which probably is fine, as refs cannot contain spaces ;-). >> + # git-rev-list cannot determine edge objects if a date restriction is >> + # given... we do things a slow way if max-age or min-age are given > > Might make sense to teach max-age about boundary commits instead... You are talking about a rather extensive change, but could be done. But that is a separate issue, I think. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-bundle: move objects and references by archive. 2007-02-19 1:50 ` Junio C Hamano @ 2007-02-19 2:02 ` Johannes Schindelin 0 siblings, 0 replies; 42+ messages in thread From: Johannes Schindelin @ 2007-02-19 2:02 UTC (permalink / raw) To: Junio C Hamano; +Cc: Mark Levedahl, git Hi, On Sun, 18 Feb 2007, Junio C Hamano wrote: > Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > > > Sorry to be such a PITA, but I really, really think that it is wrong to > > make a tar dependency here. You said your cygwin has problems with binary > > files. Could you please try this: > > > > $ echo -ne '\x1a\x1b\x15\x10\0abc' | cat | wc > > It might be just me, but "echo -e" makes me feel much more > uneasy than explicitly saying "Sorry, we require GNU tar in > this little corner of the system". Hey, this was meant as a _test_ on why it fails for Mark, not as the end result! > >> + bfile=$1 > > > > This can contain spaces (you are working on Windows, right? Windows users > > _love_ spaces in their filenames). > > Which is fine. Try var1=$var2 with something with SP or TAB in var2 > and later try running > > $ echo "<$var1>" > > to see what happens. Okay, I did not know that. > >> + gitrevargs=$(git-rev-parse --symbolic --revs-only $args) || exit 1 > > > > Here, you rely again on the refs not containing spaces. > > Which probably is fine, as refs cannot contain spaces ;-). Ah! As we allow so many things in refnames, I assumed that spaces are allowed also. > >> + # git-rev-list cannot determine edge objects if a date restriction is > >> + # given... we do things a slow way if max-age or min-age are given > > > > Might make sense to teach max-age about boundary commits instead... > > You are talking about a rather extensive change, but could be > done. But that is a separate issue, I think. I don't think it's so hard. Ciao, Dscho ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-bundle: move objects and references by archive. 2007-02-19 1:07 ` Johannes Schindelin 2007-02-19 1:50 ` Junio C Hamano @ 2007-02-19 7:56 ` Shawn O. Pearce 2007-02-19 13:29 ` Mark Levedahl 2 siblings, 0 replies; 42+ messages in thread From: Shawn O. Pearce @ 2007-02-19 7:56 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Mark Levedahl, git Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > On Sun, 18 Feb 2007, Mark Levedahl wrote: > > + gitrevargs=$(git-rev-parse --symbolic --revs-only $args) || exit 1 > > Here, you rely again on the refs not containing spaces. They cannot. OK, they can if you avoid any of the Git code: git rev-parse HEAD >".git/refs/heads/ref with space" but a lot of things downstream may get annoyed by doing this. Spaces are not permitted in ref names. > > + # create refs and pack > > + tmp="$GIT_DIR/bundle_tmp$$" > > Should we really rely on $GIT_DIR being writable? For unbundling, yes, but > for bundling? Given the great confusion with git-status trying to write > into $GIT_DIR, and people _demanding_ that it does not do so, I recommend > against that. I agree. For an application like bundle creation, $GIT_DIR should be strictly read-only. There is no reason for this application to need to create temporary files, as the bundle can be streamed to stdout (or optionally to a file supplied by the user on the command line). > > + git-rev-list --objects $fullrevargs | cut -b-40 | > > + git pack-objects --all-progress --stdout > "$pack" && > > pack-objects can take rev-list arguments itself these days, using "--revs" > and being piped the rev-list arguments. Uh, actually it can't. I thought it could too. It doesn't. It can only take a ref name or --not. I'm actually not sure why we limited it like that, but we did... It may be a good idea to change that restriction to not exist. -- Shawn. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-bundle: move objects and references by archive. 2007-02-19 1:07 ` Johannes Schindelin 2007-02-19 1:50 ` Junio C Hamano 2007-02-19 7:56 ` Shawn O. Pearce @ 2007-02-19 13:29 ` Mark Levedahl 2007-02-19 15:03 ` Johannes Schindelin 2 siblings, 1 reply; 42+ messages in thread From: Mark Levedahl @ 2007-02-19 13:29 UTC (permalink / raw) To: git "Johannes Schindelin" <Johannes.Schindelin@gmx.de> wrote in message news:Pine.LNX.4.63.0702190126220.22628@wbgn013.biozentrum.uni-wuerzburg.de... > Hi, > > Sorry to be such a PITA, but I really, really think that it is wrong > to > make a tar dependency here. You said your cygwin has problems with > binary > files. Could you please try this: > > $ echo -ne '\x1a\x1b\x15\x10\0abc' | cat | wc > Same result on Cygwin and FC6: ~>echo -ne '\x1a\x1b\x15\x10\0abc' | cat | wc 0 1 8 > If it returns anything else than "<tab>0<tab>1<tab>8", then your setup > works differently from mine. I fit returns what I expect it to, then > we > can use cat directly and do not have to move the tar bloat around (it > is > inherently block based, so it wastes a lot of space, and it also > stores > other things we'll never use, like permissions for all files). To repeat an earlier message, I tried the following: cat bundle | ( while read <header stuff> do <prcess header stuff> done git-pack-index --stdin ) and it worked, MOST of the time, most being the important word here. My memory is already fuzzy about the failure rate, but it was something like 2 of 7 bundle files I passed through resulting in git-pack-index reporting a corrupted pack file: when I manually edited out the header stuff and passed the remains of the file directly to git-pack-index things were fine. The same experiment worked perfectly on my FC6 system, so it was definitely some part of Cygwin, not the script. If git is worried about tickling "which" on ancient Solaris, I think it is a bit out of balance to simultaneously require a not yet written version of Cygwin (that will get fixed after someone has time to reduce this problem to a simple testcase and push back to the Cygwin maintainers). A bundle file *is* an archive. We can use an existing archiver or write a new one. I don't really favor writing one but that is personal preference for not reinventing the wheel, nothing more. However, given Cgywin's problems an archiver written in shell isn't going to work for binary files. We could uuencode / uudecode the pack files and make bundles pure ascii text. Or, we could use tar. Of course, rewriting this as a builtin offers the ability to write a purpose built archiver as well, but I was trying to avoid that. </unsolicited ignoreable gripe>Has anyone considered how much easier all of git would be if it were written in c + python, rather than c + every variant of shell + core utils + non-core utils known to mankind since the dawn of unix?</unsolicited ignorable gripe> Mark ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-bundle: move objects and references by archive. 2007-02-19 13:29 ` Mark Levedahl @ 2007-02-19 15:03 ` Johannes Schindelin 0 siblings, 0 replies; 42+ messages in thread From: Johannes Schindelin @ 2007-02-19 15:03 UTC (permalink / raw) To: Mark Levedahl; +Cc: git Hi, On Mon, 19 Feb 2007, Mark Levedahl wrote: > "Johannes Schindelin" <Johannes.Schindelin@gmx.de> wrote in message > news:Pine.LNX.4.63.0702190126220.22628@wbgn013.biozentrum.uni-wuerzburg.de... > > Hi, > > > > Sorry to be such a PITA, but I really, really think that it is wrong to > > make a tar dependency here. You said your cygwin has problems with binary > > files. Could you please try this: > > > > $ echo -ne '\x1a\x1b\x15\x10\0abc' | cat | wc > > > Same result on Cygwin and FC6: > ~>echo -ne '\x1a\x1b\x15\x10\0abc' | cat | wc > 0 1 8 Okay, so I really would like to have the bundle file from this experiment: > To repeat an earlier message, I tried the following: > > cat bundle | ( > while read <header stuff> do > <prcess header stuff> > done > git-pack-index --stdin > ) because I cannot see how the aforementioned "cat" can succeed, while pack-index --stdin cannot. > A bundle file *is* an archive. But the archive in the bundle really is the pack. You don't need any other archive there. You don't need to know the name of the prerequisites file, for example, if that information is in the header to begin with. > </unsolicited ignoreable gripe>Has anyone considered how much easier all > of git would be if it were written in c + python, rather than c + every > variant of shell + core utils + non-core utils known to mankind since > the dawn of unix?</unsolicited ignorable gripe> You realize that Python can be a PITA, a real PITA, when it comes to non-Linux systems? Just look at the fun we had with subprocess. It is included in Python 2.4, _works_ with Python 2.3, but _not_ Python with 2.2. If you now say "well, just upgrade!" then I SHOUT at you. There is _no_ excuse forcing users to upgrade when the failure is on _your_ part. Ciao, Dscho ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-bundle: move objects and references by archive. 2007-02-18 22:47 [PATCH] Add git-bundle: move objects and references by archive Mark Levedahl 2007-02-19 1:07 ` Johannes Schindelin @ 2007-02-19 1:49 ` Junio C Hamano 1 sibling, 0 replies; 42+ messages in thread From: Junio C Hamano @ 2007-02-19 1:49 UTC (permalink / raw) To: Mark Levedahl; +Cc: junkio, git Mark Levedahl <mdl123@verizon.net> writes: > Some workflows require use of repositories on machines that cannot be > connected, preventing use of git-fetch / git-push to transport objects > and references between the repositories. Looks much nicer. Thanks. > 6 files changed, 338 insertions(+), 4 deletions(-) > mode change 100755 => 100644 Documentation/cmd-list.perl Hmmmm? > +verify() { > + # Check bundle version > + test -f "$bfile" || die "cannot find $bfile" > + test "$($TAR -xf ""$bfile"" -O version)" = "v1 git-bundle" || > + die "$bfile doesn't look like a v1 bundle file." It makes me feel uneasy to see these double quotes next to each other. Are you sure you got your quoting right (I haven't checked)? Just a minor style issue: traditional single command letter to tar is easier to read if you do not write leading '-'. IOW, "tar xf" is preferred over "tar -xf". > + # do fast check, then if any prereqs are missing then go line by line > + # to be verbose about the errors > + prereqs=$($TAR xf "$bfile" -O prerequisites) > + test -z "$prereqs" && return 0 > + bad=$(echo "$prereqs" | cut -b-40 | git-rev-list --stdin --not --all 2>&1) > + if test -n "$bad" ; then > + test "$1" = "--silent" && return 1 > + echo "error: $bfile requires the following commits you lack:" > + echo "$prereqs" | > + while read sha1 comment ; do > + missing=$(git-rev-list $sha1 --not --all 2>&1) > + test -n "$missing" && echo "$sha1 $comment" > + done > + exit 1 > + fi This part is very hard to read. Maybe your tabstop is not set to 8? Linus rightly said "or something like that"; I do not think the empty output from "git-rev-list --stdin --not --all" is enough. Here, you are making sure that each line of $prereqs are names of objects available in your repository, and well connected to your existing refs. If one of the commits listed in $prereqs does not exist in your repository, I think rev-list exits with non-zero status, with a message to stderr, without spitting out anything to its stdout, so you would need: if bad=$(echo ... | rev-list ...) && test -z "$bad" then : happy else echo 2>&1 "error: you lack these..." fi Incidentally, I think you do not have to squelch error message from rev-list here. Other than that, I think Linus's check is correct and the above fills his "something like that" [*1*]. > +list_heads() { > + if test -z "$*" ; then > + $TAR -xf "$bfile" -O references 2>/dev/null || exit 1 > + else > + ($TAR -xf "$bfile" -O references 2>/dev/null || exit 1) | > + while read sha1 ref ; do > + for arg in $* ; do > + if test "${ref%$arg}" != "$ref" ; then > + echo "$sha1 $ref" > + break > + fi > + done > + done > + fi > +} I suspect that "exit 1" in the upstream of pipe does not have any effect on the outcome -- yup, I just tried... $ (tar xf no-such-file -O references || exit 1) | while read foo; do echo $foo; done ; echo $? 0 $ > +test -z "$action" && die "No action given, what should I do?" > + > +# what tar to use, prefer gtar, then tar. > +if test -z "$TAR" ; then > + GTAR=$(which gtar 2>/dev/null) > + TAR=${GTAR:-tar} > +fi "which" is Ok for interactive use, but never use it in scripts. I've seen "which foo" that says "No foo found." to its standard output without erroring out (I think it was older Solaris). +unbundle) + verify --silent || exit 1 + $TAR -xf "$bfile" -O pack | git-index-pack --stdin || + die "error: $bfile has a corrupted pack file" + list_heads $args;; +esac Hmm. Your verify () shell function silently fails if prereq is not met, so this makes the user of "git --unbundle file.bdl" would get silent failure and left wondering why the bundle does not unpack, doesn't it? > diff --git a/git-fetch.sh b/git-fetch.sh > index ca984e7..42cc62f 100755 > --- a/git-fetch.sh > +++ b/git-fetch.sh > @@ -377,8 +377,15 @@ fetch_main () { Nice. > diff --git a/git-ls-remote.sh b/git-ls-remote.sh > index 8ea5c5e..28bb9b8 100755 > --- a/git-ls-remote.sh > +++ b/git-ls-remote.sh > @@ -89,8 +89,13 @@ rsync://* ) Likewise. [footnote] *1* If somebody wants to know why it works... Suppose the "global" history looked like this: ---A---B---C and your receiving repository earlier had its tip of the branch at commit A. Then the user used git-http-fetch to fetch from a repository that had its tip at commit B. http-fetch (another "commit walker", local-fetch, works the same way) learns that the tip commit is B, downloads commit B, reads what's in there to learn the tree associated with commit B, fetches that tree, reads what's in there to learn the subtrees and blobs in it, ... The user can interrupt this process, at the point after commit B and everything that is needed to complete it was downloaded, but before it actually updated the ref to point at B. Suppose then you created a bundle with B..C to update the tip to C, requiring you to have B. "echo B | rev-list --not --all" would happily say "Oh, B is a commit that is not reachable from any of your refs", but in fact you can safely update the ref that currently point at A (because the user interrupted the earlier fetch) with C after checking C is a fast forward of A. This might make "outputs empty" look like a wrong check. However, the user could have interrupted the earlier http-fetch, at the point after commit B was downloaded but before finishing to download the tree associated with it. A commit walker does not update the ref, so the commit object B will be in your receiving repository but it is not complete (it lacks its tree object). Even in this case "echo B | rev-list --not --all" would say "B is ahead of your repository". That's why "outputs empty" is a good check. Before somebody wonders further. An alternative good check is to see if "echo B | rev-list --objects --not --all" does _not_ fail, and all of the objects listed in its output are available in the receiving repository. This will solve the apparent inefficiency in the case of killing http-fetch after it downloaded all the necessary objects before it updated the ref, which I described earlier. But I think that the window for that to happen is very small and if you killed a commit walker it is much more likely that you lack some objects, so doing this alternative check is optimizing for a wrong case. ^ permalink raw reply [flat|nested] 42+ messages in thread
[parent not found: <Pine.LNX.4.63.0702220157130.22628@wbgn013.biozentrum.uni-wuerz burg.de>]
* [PATCH] Add git-bundle: move objects and references by archive @ 2007-02-22 0:59 ` Johannes Schindelin 2007-02-22 3:28 ` Nicolas Pitre ` (2 more replies) 0 siblings, 3 replies; 42+ messages in thread From: Johannes Schindelin @ 2007-02-22 0:59 UTC (permalink / raw) To: git; +Cc: Mark Levedahl, junkio Some workflows require use of repositories on machines that cannot be connected, preventing use of git-fetch / git-push to transport objects and references between the repositories. git-bundle provides an alternate transport mechanism, effectively allowing git-fetch and git-pull to operate using sneakernet transport. `git-bundle create` allows the user to create a bundle containing one or more branches or tags, but with specified basis assumed to exist on the target repository. At the receiving end, git-bundle acts like git-fetch-pack, allowing the user to invoke git-fetch or git-pull using the bundle file as the URL. git-fetch and git-ls-remote determine they have a bundle URL by checking that the URL points to a file, but are otherwise unchanged in operation with bundles. The original patch was done by Mark Levedahl <mdl123@verizon.net>. It was updated to make git-bundle a builtin, and get rid of the tar format: now, the first line is supposed to say "# v2 git bundle", the next lines either contain a prerequisite ("-" followed by the hash of the needed commit), or a ref (the hash of a commit, followed by the name of the ref), and finally the pack. As a result, the bundle argument can be "-" now. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- This patch should apply cleanly to "next". Mark, could you test on cygwin? .gitignore | 1 + Documentation/cmd-list.perl | 1 + Documentation/git-bundle.txt | 139 +++++++++++++++ Makefile | 1 + builtin-bundle.c | 389 ++++++++++++++++++++++++++++++++++++++++++ builtin.h | 1 + git-fetch.sh | 7 + git-ls-remote.sh | 7 +- git.c | 1 + index-pack.c | 4 +- t/t5510-fetch.sh | 28 +++- 11 files changed, 575 insertions(+), 4 deletions(-) create mode 100644 Documentation/git-bundle.txt create mode 100644 builtin-bundle.c diff --git a/.gitignore b/.gitignore index f15155d..4d3c66d 100644 --- a/.gitignore +++ b/.gitignore @@ -13,6 +13,7 @@ git-archive git-bisect git-blame git-branch +git-bundle git-cat-file git-check-ref-format git-checkout diff --git a/Documentation/cmd-list.perl b/Documentation/cmd-list.perl index a2d6268..f61c77a 100755 --- a/Documentation/cmd-list.perl +++ b/Documentation/cmd-list.perl @@ -70,6 +70,7 @@ git-archive mainporcelain git-bisect mainporcelain git-blame ancillaryinterrogators git-branch mainporcelain +git-bundle mainporcelain git-cat-file plumbinginterrogators git-checkout-index plumbingmanipulators git-checkout mainporcelain diff --git a/Documentation/git-bundle.txt b/Documentation/git-bundle.txt new file mode 100644 index 0000000..4ea9e85 --- /dev/null +++ b/Documentation/git-bundle.txt @@ -0,0 +1,139 @@ +git-bundle(1) +============= + +NAME +---- +git-bundle - Move objects and refs by archive + + +SYNOPSIS +-------- +'git-bundle' create <file> [git-rev-list args] +'git-bundle' verify <file> +'git-bundle' list-heads <file> [refname...] +'git-bundle' unbundle <file> [refname...] + +DESCRIPTION +----------- + +Some workflows require that one or more branches of development on one +machine be replicated on another machine, but the two machines cannot +be directly connected so the interactive git protocols (git, ssh, +rsync, http) cannot be used. This command provides suport for +git-fetch and git-pull to operate by packaging objects and references +in an archive at the originating machine, then importing those into +another repository using gitlink:git-fetch[1] and gitlink:git-pull[1] +after moving the archive by some means (i.e., by sneakernet). As no +direct connection between repositories exists, the user must specify a +basis for the bundle that is held by the destination repository: the +bundle assumes that all objects in the basis are already in the +destination repository. + +OPTIONS +------- + +create <file>:: + Used to create a bundle named 'file'. This requires the + git-rev-list arguments to define the bundle contents. + +verify <file>:: + Used to check that a bundle file is valid and will apply + cleanly to the current repository. This includes checks on the + bundle format itself as well as checking that the prerequisite + commits exist and are fully linked in the current repository. + git-bundle prints a list of missing commits, if any, and exits + with non-zero status. + +list-heads <file>:: + Lists the references defined in the bundle. If followed by a + list of references, only references matching those given are + printed out. + +unbundle <file>:: + Passes the objects in the bundle to gitlink:git-index-pack[1] + for storage in the repository, then prints the names of all + defined references. If a reflist is given, only references + matching those in the given list are printed. This command is + really plumbing, intended to be called only by + gitlink:git-fetch[1]. + +[git-rev-list-args...]:: + A list of arguments, accepatble to git-rev-parse and + git-rev-list, that specify the specific objects and references + to transport. For example, "master~10..master" causes the + current master reference to be packaged along with all objects + added since its 10th ancestor commit. There is no explicit + limit to the number of references and objects that may be + packaged. + + +[refname...]:: + A list of references used to limit the references reported as + available. This is principally of use to git-fetch, which + expects to recieve only those references asked for and not + necessarily everything in the pack (in this case, git-bundle is + acting like gitlink:git-fetch-pack[1]). + +SPECIFYING REFERENCES +--------------------- + +git-bundle will only package references that are shown by +git-show-ref: this includes heads, tags, and remote heads. References +such as master~1 cannot be packaged, but are perfectly suitable for +defining the basis. More than one reference may be packaged, and more +than one basis can be specified. The objects packaged are those not +contained in the union of the given bases. Each basis can be +specified explicitly (e.g., ^master~10), or implicitly (e.g., +master~10..master, master --since=10.days.ago). + +It is very important that the basis used be held by the destination. +It is ok to err on the side of conservatism, causing the bundle file +to contain objects already in the destination as these are ignored +when unpacking at the destination. + +EXAMPLE +------- + +Assume two repositories exist as R1 on machine A, and R2 on machine B. +For whatever reason, direct connection between A and B is not allowed, +but we can move data from A to B via some mechanism (CD, email, etc). +We want to update R2 with developments made on branch master in R1. +We set a tag in R1 (lastR2bundle) after the previous such transport, +and move it afterwards to help build the bundle. + +in R1 on A: +$ git-bundle create mybundle master ^lastR2bundle +$ git tag -f lastR2bundle master + +(move mybundle from A to B by some mechanism) + +in R2 on B: +$ git-bundle verify mybundle +$ git-fetch mybundle refspec + +where refspec is refInBundle:localRef + + +Also, with something like this in your config: + +[remote "bundle"] + url = /home/me/tmp/file.bdl + fetch = refs/heads/*:refs/remotes/origin/* + +You can first sneakernet the bundle file to ~/tmp/file.bdl and +then these commands: + +$ git ls-remote bundle +$ git fetch bundle +$ git pull bundle + +would treat it as if it is talking with a remote side over the +network. + +Author +------ +Written by Mark Levedahl <mdl123@verizon.net> + +GIT +--- +Part of the gitlink:git[7] suite diff --git a/Makefile b/Makefile index 56bb0b8..03a7f4a 100644 --- a/Makefile +++ b/Makefile @@ -276,6 +276,7 @@ BUILTIN_OBJS = \ builtin-archive.o \ builtin-blame.o \ builtin-branch.o \ + builtin-bundle.o \ builtin-cat-file.o \ builtin-checkout-index.o \ builtin-check-ref-format.o \ diff --git a/builtin-bundle.c b/builtin-bundle.c new file mode 100644 index 0000000..4bd596a --- /dev/null +++ b/builtin-bundle.c @@ -0,0 +1,389 @@ +#include "cache.h" +#include "object.h" +#include "commit.h" +#include "diff.h" +#include "revision.h" +#include "list-objects.h" +#include "exec_cmd.h" + +/* + * Basic handler for bundle files to connect repositories via sneakernet. + * Invocation must include action. + * This function can create a bundle or provide information on an existing + * bundle supporting git-fetch, git-pull, and git-ls-remote + */ + +static const char *bundle_usage="git-bundle (--create <bundle> <git-rev-list args> | --verify <bundle> | --list-heads <bundle> [refname]... | --unbundle <bundle> [refname]... )"; + +static const char bundle_signature[] = "# v2 git bundle\n"; + +struct ref_list { + unsigned int nr, alloc; + struct { + unsigned char sha1[20]; + char *name; + } *list; +}; + +static void add_to_ref_list(const unsigned char *sha1, const char *name, + struct ref_list *list) +{ + if (list->nr + 1 >= list->alloc) { + list->alloc = alloc_nr(list->nr + 1); + list->list = xrealloc(list->list, + list->alloc * sizeof(list->list[0])); + } + memcpy(list->list[list->nr].sha1, sha1, 20); + list->list[list->nr].name = xstrdup(name); + list->nr++; +} + +struct bundle_header { + struct ref_list prerequisites, references; +}; + +/* this function returns the length of the string */ +static int read_string(int fd, char *buffer, int size) +{ + int i; + for (i = 0; i < size - 1; i++) { + int count = read(fd, buffer + i, 1); + if (count < 0) + return error("Read error: %s", strerror(errno)); + if (count == 0) { + i--; + break; + } + if (buffer[i] == '\n') + break; + } + buffer[i + 1] = '\0'; + return i + 1; +} + +/* returns an fd */ +static int read_header(const char *path, struct bundle_header *header) { + char buffer[1024]; + int fd = open(path, O_RDONLY); + + if (fd < 0) + return error("could not open '%s'", path); + if (read_string(fd, buffer, sizeof(buffer)) < 0 || + strcmp(buffer, bundle_signature)) { + close(fd); + return error("'%s' does not look like a v2 bundle file", path); + } + while (read_string(fd, buffer, sizeof(buffer)) > 0 + && buffer[0] != '\n') { + int offset = buffer[0] == '-' ? 1 : 0; + int len = strlen(buffer); + unsigned char sha1[20]; + struct ref_list *list = offset ? &header->prerequisites + : &header->references; + if (get_sha1_hex(buffer + offset, sha1)) { + close(fd); + return error("invalid SHA1: %s", buffer); + } + if (buffer[len - 1] == '\n') + buffer[len - 1] = '\0'; + add_to_ref_list(sha1, buffer + 41 + offset, list); + } + return fd; +} + +/* if in && *in >= 0, take that as input file descriptor instead */ +static int fork_with_pipe(const char **argv, int *in, int *out) +{ + int needs_in, needs_out; + int fdin[2], fdout[2], pid; + + needs_in = in && *in < 0; + if (needs_in) { + if (pipe(fdin) < 0) + return error("could not setup pipe"); + *in = fdin[1]; + } + + needs_out = out && *out < 0; + if (needs_out) { + if (pipe(fdout) < 0) + return error("could not setup pipe"); + *out = fdout[0]; + } + + if ((pid = fork()) < 0) { + if (needs_in) { + close(fdin[0]); + close(fdin[1]); + } + if (needs_out) { + close(fdout[0]); + close(fdout[1]); + } + return error("could not fork"); + } + if (!pid) { + if (needs_in) { + dup2(fdin[0], 0); + close(fdin[0]); + close(fdin[1]); + } else if (in) + dup2(*in, 0); + if (needs_out) { + dup2(fdout[1], 1); + close(fdout[0]); + close(fdout[1]); + } else if (out) + dup2(*out, 1); + exit(execv_git_cmd(argv)); + } + if (needs_in) + close(fdin[0]); + if (needs_out) + close(fdout[1]); + return pid; +} + +static int verify_bundle(struct bundle_header *header) +{ + /* + * Do fast check, then if any prereqs are missing then go line by line + * to be verbose about the errors + */ + struct ref_list *p = &header->prerequisites; + const char *argv[5] = {"rev-list", "--stdin", "--not", "--all", NULL}; + int pid, in, out, i, ret = 0; + char buffer[1024]; + + in = out = -1; + pid = fork_with_pipe(argv, &in, &out); + if (pid < 0) + return error("Could not fork rev-list"); + if (!fork()) { + for (i = 0; i < p->nr; i++) { + write(in, sha1_to_hex(p->list[i].sha1), 40); + write(in, "\n", 1); + } + close(in); + exit(0); + } + close(in); + while (read_string(out, buffer, sizeof(buffer)) > 0) { + if (ret++ == 0) + error ("The bundle requires the following commits you lack:"); + fprintf(stderr, "%s", buffer); + } + close(out); + while (waitpid(pid, &i, 0) < 0) + if (errno != EINTR) + return -1; + if (!ret && (!WIFEXITED(i) || WEXITSTATUS(i))) + return error("At least one prerequisite is lacking."); + + return ret; +} + +static int list_heads(struct bundle_header *header, int argc, const char **argv) +{ + int i; + struct ref_list *r = &header->references; + + for (i = 0; i < r->nr; i++) { + if (argc > 1) { + int j; + for (j = 1; j < argc; j++) + if (!strcmp(r->list[i].name, argv[j])) + break; + if (j == argc) + continue; + } + printf("%s %s\n", sha1_to_hex(r->list[i].sha1), + r->list[i].name); + } + return 0; +} + +static void show_commit(struct commit *commit) +{ + write(1, sha1_to_hex(commit->object.sha1), 40); + write(1, "\n", 1); + if (commit->parents) { + free_commit_list(commit->parents); + commit->parents = NULL; + } +} + +static void show_object(struct object_array_entry *p) +{ + /* An object with name "foo\n0000000..." can be used to + * * confuse downstream git-pack-objects very badly. + * */ + const char *ep = strchr(p->name, '\n'); + int len = ep ? ep - p->name : strlen(p->name); + write(1, sha1_to_hex(p->item->sha1), 40); + write(1, " ", 1); + if (len) + write(1, p->name, len); + write(1, "\n", 1); +} + +static int create_bundle(struct bundle_header *header, const char *path, + int argc, const char **argv) +{ + int bundle_fd = -1; + const char **argv_boundary = xmalloc((argc + 3) * sizeof(const char *)); + const char **argv_pack = xmalloc(4 * sizeof(const char *)); + int pid, in, out, i, status; + char buffer[1024]; + struct rev_info revs; + + bundle_fd = (!strcmp(path, "-") ? 1 : + open(path, O_CREAT | O_WRONLY, 0666)); + if (bundle_fd < 0) + return error("Could not write to '%s'", path); + + /* write signature */ + write(bundle_fd, bundle_signature, strlen(bundle_signature)); + + /* write prerequisites */ + memcpy(argv_boundary + 2, argv + 1, argc * sizeof(const char *)); + argv_boundary[0] = "rev-list"; + argv_boundary[1] = "--boundary"; + argv_boundary[argc + 1] = NULL; + out = -1; + pid = fork_with_pipe(argv_boundary, NULL, &out); + if (pid < 0) + return -1; + while ((i = read_string(out, buffer, sizeof(buffer))) > 0) + if (buffer[0] == '-') + write(bundle_fd, buffer, i); + while ((i = waitpid(pid, &status, 0)) < 0) + if (errno != EINTR) + return error("rev-list died"); + if (!WIFEXITED(status) || WEXITSTATUS(status)) + return error("rev-list died %d", WEXITSTATUS(status)); + + /* write references */ + save_commit_buffer = 0; + init_revisions(&revs, NULL); + revs.tag_objects = 1; + revs.tree_objects = 1; + revs.blob_objects = 1; + argc = setup_revisions(argc, argv, &revs, NULL); + if (argc > 1) + return error("unrecognized argument: %s'", argv[1]); + for (i = 0; i < revs.pending.nr; i++) { + struct object_array_entry *e = revs.pending.objects + i; + if (!(e->item->flags & UNINTERESTING)) { + unsigned char sha1[20]; + char *ref; + if (dwim_ref(e->name, strlen(e->name), sha1, &ref) != 1) + continue; + write(bundle_fd, sha1_to_hex(e->item->sha1), 40); + write(bundle_fd, " ", 1); + write(bundle_fd, ref, strlen(ref)); + write(bundle_fd, "\n", 1); + free(ref); + } + } + + /* end header */ + write(bundle_fd, "\n", 1); + + /* write pack */ + argv_pack[0] = "pack-objects"; + argv_pack[1] = "--all-progress"; + argv_pack[2] = "--stdout"; + argv_pack[3] = NULL; + in = -1; + out = bundle_fd; + pid = fork_with_pipe(argv_pack, &in, &out); + if (pid < 0) + return error("Could not spawn pack-objects"); + close(1); + close(bundle_fd); + dup2(in, 1); + close(in); + prepare_revision_walk(&revs); + traverse_commit_list(&revs, show_commit, show_object); + close(1); + while (waitpid(pid, &status, 0) < 0) + if (errno != EINTR) + return -1; + if (!WIFEXITED(status) || WEXITSTATUS(status)) + return error ("pack-objects died"); + return 0; +} + +static int unbundle(struct bundle_header *header, int bundle_fd, + int argc, const char **argv) +{ + const char *argv_index_pack[] = {"index-pack", "--stdin", NULL}; + int pid, status, dev_null; + + if (verify_bundle(header)) + return -1; + dev_null = open("/dev/null", O_WRONLY); + pid = fork_with_pipe(argv_index_pack, &bundle_fd, &dev_null); + if (pid < 0) + return error("Could not spawn index-pack"); + close(bundle_fd); + while (waitpid(pid, &status, 0) < 0) + if (errno != EINTR) + return error("index-pack died"); + if (!WIFEXITED(status) || WEXITSTATUS(status)) + return error("index-pack exited with status %d", + WEXITSTATUS(status)); + return list_heads(header, argc, argv); +} + +int cmd_bundle(int argc, const char **argv, const char *prefix) +{ + struct bundle_header header; + int nongit = 0; + const char *cmd, *bundle_file; + int bundle_fd = -1; + char buffer[PATH_MAX]; + + if (argc < 3) + usage(bundle_usage); + + cmd = argv[1]; + bundle_file = argv[2]; + argc -= 2; + argv += 2; + + prefix = setup_git_directory_gently(&nongit); + if (prefix && bundle_file[0] != '/') { + snprintf(buffer, sizeof(buffer), "%s/%s", prefix, bundle_file); + bundle_file = buffer; + } + + memset(&header, 0, sizeof(header)); + if (strcmp(cmd, "create") && + !(bundle_fd = read_header(bundle_file, &header))) + return 1; + + if (!strcmp(cmd, "verify")) { + close(bundle_fd); + if (verify_bundle(&header)) + return 1; + fprintf(stderr, "%s is okay\n", bundle_file); + return 0; + } + if (!strcmp(cmd, "list-heads")) { + close(bundle_fd); + return !!list_heads(&header, argc, argv); + } + if (!strcmp(cmd, "create")) { + if (nongit) + die("Need a repository to create a bundle."); + return !!create_bundle(&header, bundle_file, argc, argv); + } else if (!strcmp(cmd, "unbundle")) { + if (nongit) + die("Need a repository to unbundle."); + return !!unbundle(&header, bundle_fd, argc, argv); + } else + usage(bundle_usage); +} + diff --git a/builtin.h b/builtin.h index 57e8741..528074b 100644 --- a/builtin.h +++ b/builtin.h @@ -19,6 +19,7 @@ extern int cmd_apply(int argc, const char **argv, const char *prefix); extern int cmd_archive(int argc, const char **argv, const char *prefix); extern int cmd_blame(int argc, const char **argv, const char *prefix); extern int cmd_branch(int argc, const char **argv, const char *prefix); +extern int cmd_bundle(int argc, const char **argv, const char *prefix); extern int cmd_cat_file(int argc, const char **argv, const char *prefix); extern int cmd_checkout_index(int argc, const char **argv, const char *prefix); extern int cmd_check_ref_format(int argc, const char **argv, const char *prefix); diff --git a/git-fetch.sh b/git-fetch.sh index 5d3fec0..5ae0d28 100755 --- a/git-fetch.sh +++ b/git-fetch.sh @@ -388,9 +388,16 @@ fetch_main () { ( : subshell because we muck with IFS IFS=" $LF" ( + if test -f "$remote" ; then + test -n "$shallow_depth" && + die "shallow clone with bundle is not supported" + git-bundle unbundle "$remote" $rref || + echo failed "$remote" + else git-fetch-pack --thin $exec $keep $shallow_depth $no_progress \ "$remote" $rref || echo failed "$remote" + fi ) | ( trap ' diff --git a/git-ls-remote.sh b/git-ls-remote.sh index 8ea5c5e..a6ed99a 100755 --- a/git-ls-remote.sh +++ b/git-ls-remote.sh @@ -89,8 +89,13 @@ rsync://* ) ;; * ) - git-peek-remote $exec "$peek_repo" || + if test -f "$peek_repo" ; then + git bundle list-heads "$peek_repo" || echo "failed slurping" + else + git-peek-remote $exec "$peek_repo" || + echo "failed slurping" + fi ;; esac | sort -t ' ' -k 2 | diff --git a/git.c b/git.c index 83f3d90..a184d47 100644 --- a/git.c +++ b/git.c @@ -229,6 +229,7 @@ static void handle_internal_command(int argc, const char **argv, char **envp) { "archive", cmd_archive }, { "blame", cmd_blame, RUN_SETUP }, { "branch", cmd_branch, RUN_SETUP }, + { "bundle", cmd_bundle }, { "cat-file", cmd_cat_file, RUN_SETUP }, { "checkout-index", cmd_checkout_index, RUN_SETUP }, { "check-ref-format", cmd_check_ref_format }, diff --git a/index-pack.c b/index-pack.c index fa9a0e7..5ccf4c4 100644 --- a/index-pack.c +++ b/index-pack.c @@ -457,8 +457,8 @@ static void parse_pack_objects(unsigned char *sha1) /* If input_fd is a file, we should have reached its end now. */ if (fstat(input_fd, &st)) die("cannot fstat packfile: %s", strerror(errno)); - if (S_ISREG(st.st_mode) && st.st_size != consumed_bytes) - die("pack has junk at the end"); + if (input_fd && S_ISREG(st.st_mode) && st.st_size != consumed_bytes) + die("pack has junk at the end: 0%o, %d, %d %d", st.st_mode, (int)st.st_size, (int)consumed_bytes, input_fd); if (!nr_deltas) return; diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh index 50c6485..fa76662 100755 --- a/t/t5510-fetch.sh +++ b/t/t5510-fetch.sh @@ -35,7 +35,9 @@ test_expect_success "clone and setup child repos" ' echo "URL: ../two/.git/" echo "Pull: refs/heads/master:refs/heads/two" echo "Pull: refs/heads/one:refs/heads/one" - } >.git/remotes/two + } >.git/remotes/two && + cd .. && + git clone . bundle ' test_expect_success "fetch test" ' @@ -81,4 +83,28 @@ test_expect_success 'fetch following tags' ' ' +test_expect_success 'create bundle 1' ' + cd "$D" && + echo >file updated again by origin && + git commit -a -m "tip" && + git bundle create bundle1 master^..master +' + +test_expect_success 'create bundle 2' ' + cd "$D" && + git bundle create bundle2 master~2..master +' + +test_expect_failure 'unbundle 1' ' + cd "$D/bundle" && + git checkout -b some-branch && + git fetch "$D/bundle1" master:master +' + +test_expect_success 'unbundle 2' ' + cd "$D/bundle" && + git fetch ../bundle2 master:master && + test "tip" = "$(git log -1 --pretty=oneline master | cut -b42-)" +' + test_done -- 1.5.0.1.616.gc6c4-dirty ^ permalink raw reply related [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-bundle: move objects and references by archive 2007-02-22 0:59 ` Johannes Schindelin @ 2007-02-22 3:28 ` Nicolas Pitre 2007-02-22 15:55 ` Johannes Schindelin 2007-02-22 6:56 ` Junio C Hamano 2007-02-22 9:31 ` Johannes Sixt 2 siblings, 1 reply; 42+ messages in thread From: Nicolas Pitre @ 2007-02-22 3:28 UTC (permalink / raw) To: Johannes Schindelin; +Cc: git, Mark Levedahl, Junio C Hamano On Thu, 22 Feb 2007, Johannes Schindelin wrote: > diff --git a/index-pack.c b/index-pack.c > index fa9a0e7..5ccf4c4 100644 > --- a/index-pack.c > +++ b/index-pack.c > @@ -457,8 +457,8 @@ static void parse_pack_objects(unsigned char *sha1) > /* If input_fd is a file, we should have reached its end now. */ > if (fstat(input_fd, &st)) > die("cannot fstat packfile: %s", strerror(errno)); > - if (S_ISREG(st.st_mode) && st.st_size != consumed_bytes) > - die("pack has junk at the end"); > + if (input_fd && S_ISREG(st.st_mode) && st.st_size != consumed_bytes) > + die("pack has junk at the end: 0%o, %d, %d %d", st.st_mode, (int)st.st_size, (int)consumed_bytes, input_fd); > > if (!nr_deltas) > return; What is this supposed to mean? Nicolas ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-bundle: move objects and references by archive 2007-02-22 3:28 ` Nicolas Pitre @ 2007-02-22 15:55 ` Johannes Schindelin 2007-02-22 16:14 ` Simon 'corecode' Schubert 2007-02-22 16:24 ` Nicolas Pitre 0 siblings, 2 replies; 42+ messages in thread From: Johannes Schindelin @ 2007-02-22 15:55 UTC (permalink / raw) To: Nicolas Pitre; +Cc: git, Mark Levedahl, Junio C Hamano Hi, On Wed, 21 Feb 2007, Nicolas Pitre wrote: > On Thu, 22 Feb 2007, Johannes Schindelin wrote: > > > diff --git a/index-pack.c b/index-pack.c > > index fa9a0e7..5ccf4c4 100644 > > --- a/index-pack.c > > +++ b/index-pack.c > > @@ -457,8 +457,8 @@ static void parse_pack_objects(unsigned char *sha1) > > /* If input_fd is a file, we should have reached its end now. */ > > if (fstat(input_fd, &st)) > > die("cannot fstat packfile: %s", strerror(errno)); > > - if (S_ISREG(st.st_mode) && st.st_size != consumed_bytes) > > - die("pack has junk at the end"); > > + if (input_fd && S_ISREG(st.st_mode) && st.st_size != consumed_bytes) > > + die("pack has junk at the end: 0%o, %d, %d %d", st.st_mode, (int)st.st_size, (int)consumed_bytes, input_fd); > > > > if (!nr_deltas) > > return; > > What is this supposed to mean? The funny thing is, if you stream part of the bundle file to index-pack, S_ISREG(st.st_mode) is true, even if input_fd == 0. Then, because it is only part of the bundle file, the size check fails. Ciao, Dscho ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-bundle: move objects and references by archive 2007-02-22 15:55 ` Johannes Schindelin @ 2007-02-22 16:14 ` Simon 'corecode' Schubert 2007-02-22 16:28 ` Nicolas Pitre 2007-02-22 16:37 ` Johannes Schindelin 2007-02-22 16:24 ` Nicolas Pitre 1 sibling, 2 replies; 42+ messages in thread From: Simon 'corecode' Schubert @ 2007-02-22 16:14 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Nicolas Pitre, git, Mark Levedahl, Junio C Hamano [-- Attachment #1: Type: text/plain, Size: 1804 bytes --] Johannes Schindelin wrote: > Hi, > > On Wed, 21 Feb 2007, Nicolas Pitre wrote: > >> On Thu, 22 Feb 2007, Johannes Schindelin wrote: >> >>> diff --git a/index-pack.c b/index-pack.c >>> index fa9a0e7..5ccf4c4 100644 >>> --- a/index-pack.c >>> +++ b/index-pack.c >>> @@ -457,8 +457,8 @@ static void parse_pack_objects(unsigned char *sha1) >>> /* If input_fd is a file, we should have reached its end now. */ >>> if (fstat(input_fd, &st)) >>> die("cannot fstat packfile: %s", strerror(errno)); >>> - if (S_ISREG(st.st_mode) && st.st_size != consumed_bytes) >>> - die("pack has junk at the end"); >>> + if (input_fd && S_ISREG(st.st_mode) && st.st_size != consumed_bytes) >>> + die("pack has junk at the end: 0%o, %d, %d %d", st.st_mode, (int)st.st_size, (int)consumed_bytes, input_fd); >>> >>> if (!nr_deltas) >>> return; >> What is this supposed to mean? > > The funny thing is, if you stream part of the bundle file to index-pack, > S_ISREG(st.st_mode) is true, even if input_fd == 0. Well, of course: you opened a regular file and pass this as stdin to index-pack. Maybe something like this would be cleaner: if (IS_REF(st.st_mode) && lseek(input_fd, 0, SEEK_CUR) != st.st_size) die("..."); Because the point isn't actually that we consumed less bytes than the file's size, but that there is still data available after we are done with handling the pack. Anyways, is this a reason to die(), or rather just a remark? cheers simon -- Serve - BSD +++ RENT this banner advert +++ ASCII Ribbon /"\ Work - Mac +++ space for low €€€ NOW!1 +++ Campaign \ / Party Enjoy Relax | http://dragonflybsd.org Against HTML \ Dude 2c 2 the max ! http://golden-apple.biz Mail + News / \ [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 252 bytes --] ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-bundle: move objects and references by archive 2007-02-22 16:14 ` Simon 'corecode' Schubert @ 2007-02-22 16:28 ` Nicolas Pitre 2007-02-22 16:37 ` Johannes Schindelin 1 sibling, 0 replies; 42+ messages in thread From: Nicolas Pitre @ 2007-02-22 16:28 UTC (permalink / raw) To: Simon 'corecode' Schubert Cc: Johannes Schindelin, git, Mark Levedahl, Junio C Hamano On Thu, 22 Feb 2007, Simon 'corecode' Schubert wrote: > Johannes Schindelin wrote: > > Hi, > > > > On Wed, 21 Feb 2007, Nicolas Pitre wrote: > > > > > On Thu, 22 Feb 2007, Johannes Schindelin wrote: > > > > > > > diff --git a/index-pack.c b/index-pack.c > > > > index fa9a0e7..5ccf4c4 100644 > > > > --- a/index-pack.c > > > > +++ b/index-pack.c > > > > @@ -457,8 +457,8 @@ static void parse_pack_objects(unsigned char *sha1) > > > > /* If input_fd is a file, we should have reached its end now. */ > > > > if (fstat(input_fd, &st)) > > > > die("cannot fstat packfile: %s", strerror(errno)); > > > > - if (S_ISREG(st.st_mode) && st.st_size != consumed_bytes) > > > > - die("pack has junk at the end"); > > > > + if (input_fd && S_ISREG(st.st_mode) && st.st_size != consumed_bytes) > > > > + die("pack has junk at the end: 0%o, %d, %d %d", st.st_mode, > > > > (int)st.st_size, (int)consumed_bytes, input_fd); > > > > if (!nr_deltas) > > > > return; > > > What is this supposed to mean? > > > > The funny thing is, if you stream part of the bundle file to index-pack, > > S_ISREG(st.st_mode) is true, even if input_fd == 0. > > Well, of course: you opened a regular file and pass this as stdin to > index-pack. > > Maybe something like this would be cleaner: > > if (IS_REF(st.st_mode) && lseek(input_fd, 0, SEEK_CUR) != st.st_size) No that won't work because the input is buffered in the fill() function. Nicolas ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-bundle: move objects and references by archive 2007-02-22 16:14 ` Simon 'corecode' Schubert 2007-02-22 16:28 ` Nicolas Pitre @ 2007-02-22 16:37 ` Johannes Schindelin 2007-02-22 16:46 ` Simon 'corecode' Schubert 1 sibling, 1 reply; 42+ messages in thread From: Johannes Schindelin @ 2007-02-22 16:37 UTC (permalink / raw) To: Simon 'corecode' Schubert Cc: Nicolas Pitre, git, Mark Levedahl, Junio C Hamano Hi, On Thu, 22 Feb 2007, Simon 'corecode' Schubert wrote: > Johannes Schindelin wrote: > > Hi, > > > > On Wed, 21 Feb 2007, Nicolas Pitre wrote: > > > > > On Thu, 22 Feb 2007, Johannes Schindelin wrote: > > > > > > > diff --git a/index-pack.c b/index-pack.c > > > > index fa9a0e7..5ccf4c4 100644 > > > > --- a/index-pack.c > > > > +++ b/index-pack.c > > > > @@ -457,8 +457,8 @@ static void parse_pack_objects(unsigned char *sha1) > > > > /* If input_fd is a file, we should have reached its end now. */ > > > > if (fstat(input_fd, &st)) > > > > die("cannot fstat packfile: %s", strerror(errno)); > > > > - if (S_ISREG(st.st_mode) && st.st_size != consumed_bytes) > > > > - die("pack has junk at the end"); > > > > + if (input_fd && S_ISREG(st.st_mode) && st.st_size != consumed_bytes) > > > > + die("pack has junk at the end: 0%o, %d, %d %d", st.st_mode, > > > > (int)st.st_size, (int)consumed_bytes, input_fd); > > > > if (!nr_deltas) > > > > return; > > > What is this supposed to mean? > > > > The funny thing is, if you stream part of the bundle file to index-pack, > > S_ISREG(st.st_mode) is true, even if input_fd == 0. > > Well, of course: you opened a regular file and pass this as stdin to > index-pack. > > Maybe something like this would be cleaner: > > if (IS_REF(st.st_mode) && lseek(input_fd, 0, SEEK_CUR) != st.st_size) > die("..."); The lseek would fail whenever the input is _not_ a file, dying. Since index-pack is called from fetch-pack, with a socket instead of a file, it would fail for the most common user. My patch was only a minimal fixup to allow "--stdin" even starting in the middle of a file, and it does not break fetching. Ciao, Dscho ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-bundle: move objects and references by archive 2007-02-22 16:37 ` Johannes Schindelin @ 2007-02-22 16:46 ` Simon 'corecode' Schubert 2007-02-22 17:09 ` Johannes Schindelin 0 siblings, 1 reply; 42+ messages in thread From: Simon 'corecode' Schubert @ 2007-02-22 16:46 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Nicolas Pitre, git, Mark Levedahl, Junio C Hamano [-- Attachment #1: Type: text/plain, Size: 2048 bytes --] Johannes Schindelin wrote: > Hi, > > On Thu, 22 Feb 2007, Simon 'corecode' Schubert wrote: > >> Johannes Schindelin wrote: >>> Hi, >>> >>> On Wed, 21 Feb 2007, Nicolas Pitre wrote: >>> >>>> On Thu, 22 Feb 2007, Johannes Schindelin wrote: >>>> >>>>> diff --git a/index-pack.c b/index-pack.c >>>>> index fa9a0e7..5ccf4c4 100644 >>>>> --- a/index-pack.c >>>>> +++ b/index-pack.c >>>>> @@ -457,8 +457,8 @@ static void parse_pack_objects(unsigned char *sha1) >>>>> /* If input_fd is a file, we should have reached its end now. */ >>>>> if (fstat(input_fd, &st)) >>>>> die("cannot fstat packfile: %s", strerror(errno)); >>>>> - if (S_ISREG(st.st_mode) && st.st_size != consumed_bytes) >>>>> - die("pack has junk at the end"); >>>>> + if (input_fd && S_ISREG(st.st_mode) && st.st_size != consumed_bytes) >>>>> + die("pack has junk at the end: 0%o, %d, %d %d", st.st_mode, >>>>> (int)st.st_size, (int)consumed_bytes, input_fd); >>>>> if (!nr_deltas) >>>>> return; >>>> What is this supposed to mean? >>> The funny thing is, if you stream part of the bundle file to index-pack, >>> S_ISREG(st.st_mode) is true, even if input_fd == 0. >> Well, of course: you opened a regular file and pass this as stdin to >> index-pack. >> >> Maybe something like this would be cleaner: >> >> if (IS_REF(st.st_mode) && lseek(input_fd, 0, SEEK_CUR) != st.st_size) >> die("..."); > > The lseek would fail whenever the input is _not_ a file, dying. Since > index-pack is called from fetch-pack, with a socket instead of a file, it > would fail for the most common user. that's why i put IS_REF (should read IS_REG) there. but as nicolas pointed out, this won't work for read-ahead. cheers simon -- Serve - BSD +++ RENT this banner advert +++ ASCII Ribbon /"\ Work - Mac +++ space for low €€€ NOW!1 +++ Campaign \ / Party Enjoy Relax | http://dragonflybsd.org Against HTML \ Dude 2c 2 the max ! http://golden-apple.biz Mail + News / \ [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 252 bytes --] ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-bundle: move objects and references by archive 2007-02-22 16:46 ` Simon 'corecode' Schubert @ 2007-02-22 17:09 ` Johannes Schindelin 0 siblings, 0 replies; 42+ messages in thread From: Johannes Schindelin @ 2007-02-22 17:09 UTC (permalink / raw) To: Simon 'corecode' Schubert Cc: Nicolas Pitre, git, Mark Levedahl, Junio C Hamano Hi, On Thu, 22 Feb 2007, Simon 'corecode' Schubert wrote: > Johannes Schindelin wrote: > > > > On Thu, 22 Feb 2007, Simon 'corecode' Schubert wrote: > > > > > Maybe something like this would be cleaner: > > > > > > if (IS_REF(st.st_mode) && lseek(input_fd, 0, SEEK_CUR) != st.st_size) > > > die("..."); > > > > The lseek would fail whenever the input is _not_ a file, dying. Since > > index-pack is called from fetch-pack, with a socket instead of a file, it > > would fail for the most common user. > > that's why i put IS_REF (should read IS_REG) there. Ah, yes, I understand. > but as nicolas pointed out, this won't work for read-ahead. Well, you could do this: if (S_ISREG(st.st_mode) && lseek(input_fd, 0, SEEK_CUR) - input_len != st.st_size) die("..."); Of course, this suggests that a file containing a pack must not have _anything_ after the pack. But AFAIR the reading part chokes on "garbage after pack" anyway. Ciao, Dscho ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-bundle: move objects and references by archive 2007-02-22 15:55 ` Johannes Schindelin 2007-02-22 16:14 ` Simon 'corecode' Schubert @ 2007-02-22 16:24 ` Nicolas Pitre 2007-02-22 17:12 ` Johannes Schindelin 1 sibling, 1 reply; 42+ messages in thread From: Nicolas Pitre @ 2007-02-22 16:24 UTC (permalink / raw) To: Johannes Schindelin; +Cc: git, Mark Levedahl, Junio C Hamano On Thu, 22 Feb 2007, Johannes Schindelin wrote: > Hi, > > On Wed, 21 Feb 2007, Nicolas Pitre wrote: > > > On Thu, 22 Feb 2007, Johannes Schindelin wrote: > > > > > diff --git a/index-pack.c b/index-pack.c > > > index fa9a0e7..5ccf4c4 100644 > > > --- a/index-pack.c > > > +++ b/index-pack.c > > > @@ -457,8 +457,8 @@ static void parse_pack_objects(unsigned char *sha1) > > > /* If input_fd is a file, we should have reached its end now. */ > > > if (fstat(input_fd, &st)) > > > die("cannot fstat packfile: %s", strerror(errno)); > > > - if (S_ISREG(st.st_mode) && st.st_size != consumed_bytes) > > > - die("pack has junk at the end"); > > > + if (input_fd && S_ISREG(st.st_mode) && st.st_size != consumed_bytes) > > > + die("pack has junk at the end: 0%o, %d, %d %d", st.st_mode, (int)st.st_size, (int)consumed_bytes, input_fd); > > > > > > if (!nr_deltas) > > > return; > > > > What is this supposed to mean? > > The funny thing is, if you stream part of the bundle file to index-pack, > S_ISREG(st.st_mode) is true, even if input_fd == 0. Hmmmm. indeed.. Could you please make the test, including the call to fstat(), conditional on !from_stdin instead? Also I don't see the point of displaying the mode since we know that S_ISREG(st.st_mode) is true, and input_fd is of little interest as well. Nicolas ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-bundle: move objects and references by archive 2007-02-22 16:24 ` Nicolas Pitre @ 2007-02-22 17:12 ` Johannes Schindelin 2007-02-22 17:21 ` Nicolas Pitre 0 siblings, 1 reply; 42+ messages in thread From: Johannes Schindelin @ 2007-02-22 17:12 UTC (permalink / raw) To: Nicolas Pitre; +Cc: git, Mark Levedahl, Junio C Hamano Hi, On Thu, 22 Feb 2007, Nicolas Pitre wrote: > On Thu, 22 Feb 2007, Johannes Schindelin wrote: > > > Hi, > > > > On Wed, 21 Feb 2007, Nicolas Pitre wrote: > > > > > On Thu, 22 Feb 2007, Johannes Schindelin wrote: > > > > > > > diff --git a/index-pack.c b/index-pack.c > > > > index fa9a0e7..5ccf4c4 100644 > > > > --- a/index-pack.c > > > > +++ b/index-pack.c > > > > @@ -457,8 +457,8 @@ static void parse_pack_objects(unsigned char *sha1) > > > > /* If input_fd is a file, we should have reached its end now. */ > > > > if (fstat(input_fd, &st)) > > > > die("cannot fstat packfile: %s", strerror(errno)); > > > > - if (S_ISREG(st.st_mode) && st.st_size != consumed_bytes) > > > > - die("pack has junk at the end"); > > > > + if (input_fd && S_ISREG(st.st_mode) && st.st_size != consumed_bytes) > > > > + die("pack has junk at the end: 0%o, %d, %d %d", st.st_mode, (int)st.st_size, (int)consumed_bytes, input_fd); > > > > > > > > if (!nr_deltas) > > > > return; > > > > > > What is this supposed to mean? > > > > The funny thing is, if you stream part of the bundle file to index-pack, > > S_ISREG(st.st_mode) is true, even if input_fd == 0. > > Hmmmm. indeed.. > > Could you please make the test, including the call to fstat(), > conditional on !from_stdin instead? How about using Simon's idea instead (subtracting input_len)? > Also I don't see the point of displaying the mode since we know that > S_ISREG(st.st_mode) is true, and input_fd is of little interest as well. As you probably guessed, these are debug remnants. Will fix. Ciao, Dscho ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-bundle: move objects and references by archive 2007-02-22 17:12 ` Johannes Schindelin @ 2007-02-22 17:21 ` Nicolas Pitre 0 siblings, 0 replies; 42+ messages in thread From: Nicolas Pitre @ 2007-02-22 17:21 UTC (permalink / raw) To: Johannes Schindelin; +Cc: git, Mark Levedahl, Junio C Hamano On Thu, 22 Feb 2007, Johannes Schindelin wrote: > On Thu, 22 Feb 2007, Nicolas Pitre wrote: > > > Could you please make the test, including the call to fstat(), > > conditional on !from_stdin instead? > > How about using Simon's idea instead (subtracting input_len)? Yes it would work. Nicolas ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-bundle: move objects and references by archive 2007-02-22 0:59 ` Johannes Schindelin 2007-02-22 3:28 ` Nicolas Pitre @ 2007-02-22 6:56 ` Junio C Hamano 2007-02-22 7:08 ` Junio C Hamano ` (2 more replies) 2007-02-22 9:31 ` Johannes Sixt 2 siblings, 3 replies; 42+ messages in thread From: Junio C Hamano @ 2007-02-22 6:56 UTC (permalink / raw) To: Johannes Schindelin; +Cc: git, Mark Levedahl, junkio Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > It was updated to make git-bundle a builtin, and get rid of the tar > format: now, the first line is supposed to say "# v2 git bundle", the next > lines either contain a prerequisite ("-" followed by the hash of the > needed commit), or a ref (the hash of a commit, followed by the name of > the ref), and finally the pack. As a result, the bundle argument can be > "-" now. That in BNF would be? bundle = signature prereq* ref* pack signature = "# v2 git bundle\n" prereq = "-" sha1 " " "\n" ref = sha1 " " name "\n" sha1 = [0-9a-f]{40} name = <name of ref> pack = <output stream from "pack-objects --stdout"> I suspect that we might want to possibly: - have checksum protection over the part before pack data? - give descriptive name to prereq, so that an error message can say "you need v1.4.4" instead of "you need e267c2f6"? - make the fields extensible without updating "v2"? > diff --git a/Documentation/cmd-list.perl b/Documentation/cmd-list.perl > index a2d6268..f61c77a 100755 > --- a/Documentation/cmd-list.perl > +++ b/Documentation/cmd-list.perl > @@ -70,6 +70,7 @@ git-archive mainporcelain > git-bisect mainporcelain > git-blame ancillaryinterrogators > git-branch mainporcelain > +git-bundle mainporcelain > git-cat-file plumbinginterrogators > git-checkout-index plumbingmanipulators > git-checkout mainporcelain Is this really a mainporcelain? I would say ancillarymanipulators (or perhaps synchelpers). > diff --git a/Documentation/git-bundle.txt b/Documentation/git-bundle.txt > new file mode 100644 > index 0000000..4ea9e85 > ... > +Author > +------ > +Written by Mark Levedahl <mdl123@verizon.net> ... and Dscho. > diff --git a/builtin-bundle.c b/builtin-bundle.c > new file mode 100644 > index 0000000..4bd596a > --- /dev/null > +++ b/builtin-bundle.c > ... > +static const char *bundle_usage="git-bundle (--create <bundle> <git-rev-list args> | --verify <bundle> | --list-heads <bundle> [refname]... | --unbundle <bundle> [refname]... )"; I thought you removed "--" prefixes from subcommands. > +struct bundle_header { > + struct ref_list prerequisites, references; > +}; (minor style) I find these two members each on its own line easier to read, as in: struct bundle_header { struct ref_list prerequisites; struct ref_list references; }; > +/* this function returns the length of the string */ > +static int read_string(int fd, char *buffer, int size) > +{ > + int i; > + for (i = 0; i < size - 1; i++) { > + int count = read(fd, buffer + i, 1); > + if (count < 0) > + return error("Read error: %s", strerror(errno)); > + if (count == 0) { > + i--; > + break; > + } > + if (buffer[i] == '\n') > + break; > + } > + buffer[i + 1] = '\0'; > + return i + 1; > +} At least xread() please. I know the reason you read one byte at a time is because you cannot over-read into pack area, but somehow this makes me go "hmmmmmm". It's not performance critical so let's leave it that way. Is bundle supposed to be streamable? Otherwise we could probably seek back. > +/* returns an fd */ > +static int read_header(const char *path, struct bundle_header *header) { > + char buffer[1024]; > + int fd = open(path, O_RDONLY); > + > + if (fd < 0) > + return error("could not open '%s'", path); > + if (read_string(fd, buffer, sizeof(buffer)) < 0 || > + strcmp(buffer, bundle_signature)) { > + close(fd); > + return error("'%s' does not look like a v2 bundle file", path); > + } > + while (read_string(fd, buffer, sizeof(buffer)) > 0 > + && buffer[0] != '\n') { > + int offset = buffer[0] == '-' ? 1 : 0; > + int len = strlen(buffer); > + unsigned char sha1[20]; > + struct ref_list *list = offset ? &header->prerequisites > + : &header->references; > + if (get_sha1_hex(buffer + offset, sha1)) { > + close(fd); > + return error("invalid SHA1: %s", buffer); > + } > + if (buffer[len - 1] == '\n') > + buffer[len - 1] = '\0'; > + add_to_ref_list(sha1, buffer + 41 + offset, list); > + } > + return fd; > +} Don't you want to look at and validate buffer[40+offset] in any way, other than what get_sha1_hex() does (which is issspace())? > +/* if in && *in >= 0, take that as input file descriptor instead */ > +static int fork_with_pipe(const char **argv, int *in, int *out) > +{ > + int needs_in, needs_out; > + int fdin[2], fdout[2], pid; > + > + needs_in = in && *in < 0; > + if (needs_in) { > + if (pipe(fdin) < 0) > + return error("could not setup pipe"); > + *in = fdin[1]; > + } > + > + needs_out = out && *out < 0; > + if (needs_out) { > + if (pipe(fdout) < 0) > + return error("could not setup pipe"); > + *out = fdout[0]; > + } > + > + if ((pid = fork()) < 0) { > + if (needs_in) { > + close(fdin[0]); > + close(fdin[1]); > + } > + if (needs_out) { > + close(fdout[0]); > + close(fdout[1]); > + } > + return error("could not fork"); > + } > + if (!pid) { > + if (needs_in) { > + dup2(fdin[0], 0); > + close(fdin[0]); > + close(fdin[1]); > + } else if (in) > + dup2(*in, 0); > + if (needs_out) { > + dup2(fdout[1], 1); > + close(fdout[0]); > + close(fdout[1]); > + } else if (out) > + dup2(*out, 1); > + exit(execv_git_cmd(argv)); > + } Do you need to close *in and *out that are given by the caller after dup2() in the child? > +static int verify_bundle(struct bundle_header *header) > +{ > + /* > + * Do fast check, then if any prereqs are missing then go line by line > + * to be verbose about the errors > + */ > + struct ref_list *p = &header->prerequisites; > + const char *argv[5] = {"rev-list", "--stdin", "--not", "--all", NULL}; > + int pid, in, out, i, ret = 0; > + char buffer[1024]; > + > + in = out = -1; > + pid = fork_with_pipe(argv, &in, &out); > + if (pid < 0) > + return error("Could not fork rev-list"); > + if (!fork()) { > + for (i = 0; i < p->nr; i++) { > + write(in, sha1_to_hex(p->list[i].sha1), 40); > + write(in, "\n", 1); > + } > + close(in); > + exit(0); > + } > + close(in); What if write() fails? That can happen when one of the objects you feed here, or its parent objects, is missing from your repository -- receiving rev-list would die() and the writing child would sigpipe. I also wonder who waits for this child... > + while (read_string(out, buffer, sizeof(buffer)) > 0) { > + if (ret++ == 0) > + error ("The bundle requires the following commits you lack:"); > + fprintf(stderr, "%s", buffer); > + } > + close(out); I do not think this error message would buy you anything. If you say: echo commit | rev-list --stdin --not --all and you got commit back, that means you _do_ have that commit, and that commit reaches some ref you have (objects associated to that commit may be still missing). If commit is truly missing, then you do not get commit output from rev-list as it cannot even determine if it is "--not --all" or not -- it would error out, which is the more important check to see if prereqs are lacking. > + while (waitpid(pid, &i, 0) < 0) > + if (errno != EINTR) > + return -1; > + if (!ret && (!WIFEXITED(i) || WEXITSTATUS(i))) > + return error("At least one prerequisite is lacking."); So you would want to keep this error, perhaps even make it more helpful by suggesting which ones are missing. > +static int list_heads(struct bundle_header *header, int argc, const char **argv) > +{ > + int i; > + struct ref_list *r = &header->references; > + > + for (i = 0; i < r->nr; i++) { > + if (argc > 1) { > + int j; > + for (j = 1; j < argc; j++) > + if (!strcmp(r->list[i].name, argv[j])) > + break; > + if (j == argc) > + continue; > + } > + printf("%s %s\n", sha1_to_hex(r->list[i].sha1), > + r->list[i].name); > + } > + return 0; > +} I wonder if we want to have a way to list prereqs in similar way. > +static void show_commit(struct commit *commit) > +{ > + write(1, sha1_to_hex(commit->object.sha1), 40); > + write(1, "\n", 1); > + if (commit->parents) { > + free_commit_list(commit->parents); > + commit->parents = NULL; > + } > +} (everywhere) We would want write_in_full() with error handling that dies even on EPIPE. > + > +static void show_object(struct object_array_entry *p) > +{ > + /* An object with name "foo\n0000000..." can be used to > + * * confuse downstream git-pack-objects very badly. > + * */ An interesting way to wrap comments. > + /* write prerequisites */ > + memcpy(argv_boundary + 2, argv + 1, argc * sizeof(const char *)); > + argv_boundary[0] = "rev-list"; > + argv_boundary[1] = "--boundary"; > + argv_boundary[argc + 1] = NULL; > + out = -1; > + pid = fork_with_pipe(argv_boundary, NULL, &out); > + if (pid < 0) > + return -1; > + while ((i = read_string(out, buffer, sizeof(buffer))) > 0) > + if (buffer[0] == '-') > + write(bundle_fd, buffer, i); It would be helpful for the recipient if you can append output from git-describe (or name-rev) when the buffer lacks "name". > +static int unbundle(struct bundle_header *header, int bundle_fd, > + int argc, const char **argv) > +{ > + const char *argv_index_pack[] = {"index-pack", "--stdin", NULL}; > + int pid, status, dev_null; > + > + if (verify_bundle(header)) > + return -1; > + dev_null = open("/dev/null", O_WRONLY); > + pid = fork_with_pipe(argv_index_pack, &bundle_fd, &dev_null); > + if (pid < 0) > + return error("Could not spawn index-pack"); > + close(bundle_fd); > + while (waitpid(pid, &status, 0) < 0) > + if (errno != EINTR) > + return error("index-pack died"); > + if (!WIFEXITED(status) || WEXITSTATUS(status)) > + return error("index-pack exited with status %d", > + WEXITSTATUS(status)); > + return list_heads(header, argc, argv); > +} We might want to later use --keep for this as well... > diff --git a/index-pack.c b/index-pack.c > index fa9a0e7..5ccf4c4 100644 > --- a/index-pack.c > +++ b/index-pack.c > @@ -457,8 +457,8 @@ static void parse_pack_objects(unsigned char *sha1) > /* If input_fd is a file, we should have reached its end now. */ > if (fstat(input_fd, &st)) > die("cannot fstat packfile: %s", strerror(errno)); > - if (S_ISREG(st.st_mode) && st.st_size != consumed_bytes) > - die("pack has junk at the end"); > + if (input_fd && S_ISREG(st.st_mode) && st.st_size != consumed_bytes) > + die("pack has junk at the end: 0%o, %d, %d %d", st.st_mode, (int)st.st_size, (int)consumed_bytes, input_fd); > > if (!nr_deltas) > return; ?? ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-bundle: move objects and references by archive 2007-02-22 6:56 ` Junio C Hamano @ 2007-02-22 7:08 ` Junio C Hamano 2007-02-22 16:20 ` Johannes Schindelin 2007-02-22 16:17 ` Johannes Schindelin 2007-02-23 2:32 ` Mark Levedahl 2 siblings, 1 reply; 42+ messages in thread From: Junio C Hamano @ 2007-02-22 7:08 UTC (permalink / raw) To: Johannes Schindelin; +Cc: git, Mark Levedahl, junkio Junio C Hamano <junkio@cox.net> writes: >> +static int verify_bundle(struct bundle_header *header) >> +{ >> + /* >> + * Do fast check, then if any prereqs are missing then go line by line >> + * to be verbose about the errors >> + */ >> + struct ref_list *p = &header->prerequisites; >> + const char *argv[5] = {"rev-list", "--stdin", "--not", "--all", NULL}; >> + int pid, in, out, i, ret = 0; >> + char buffer[1024]; >> + >> + in = out = -1; >> + pid = fork_with_pipe(argv, &in, &out); >> + if (pid < 0) >> + return error("Could not fork rev-list"); >> + if (!fork()) { >> + for (i = 0; i < p->nr; i++) { >> + write(in, sha1_to_hex(p->list[i].sha1), 40); >> + write(in, "\n", 1); >> + } >> + close(in); >> + exit(0); >> + } >> + close(in); > > What if write() fails? That can happen when one of the objects > you feed here, or its parent objects, is missing from your > repository -- receiving rev-list would die() and the writing > child would sigpipe. > > I also wonder who waits for this child... In general, fork() to avoid bidirectional pipe deadlock is a good discipline, but in this particular case I think it would be easier to handle errors if you don't (and it would save another process). The other side "rev-list --stdin --not --all" is running a limited traversal, and would not emit anything until you stop feeding it from --stdin, or until it dies because you fed it a commit that does not exist. So as long as you check the error condition from write() for EPIPE to notice the other end died, I think you are Ok. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-bundle: move objects and references by archive 2007-02-22 7:08 ` Junio C Hamano @ 2007-02-22 16:20 ` Johannes Schindelin 2007-02-22 19:10 ` Junio C Hamano 0 siblings, 1 reply; 42+ messages in thread From: Johannes Schindelin @ 2007-02-22 16:20 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Mark Levedahl Hi, On Wed, 21 Feb 2007, Junio C Hamano wrote: > Junio C Hamano <junkio@cox.net> writes: > > >> +static int verify_bundle(struct bundle_header *header) > >> +{ > >> + /* > >> + * Do fast check, then if any prereqs are missing then go line by line > >> + * to be verbose about the errors > >> + */ > >> + struct ref_list *p = &header->prerequisites; > >> + const char *argv[5] = {"rev-list", "--stdin", "--not", "--all", NULL}; > >> + int pid, in, out, i, ret = 0; > >> + char buffer[1024]; > >> + > >> + in = out = -1; > >> + pid = fork_with_pipe(argv, &in, &out); > >> + if (pid < 0) > >> + return error("Could not fork rev-list"); > >> + if (!fork()) { > >> + for (i = 0; i < p->nr; i++) { > >> + write(in, sha1_to_hex(p->list[i].sha1), 40); > >> + write(in, "\n", 1); > >> + } > >> + close(in); > >> + exit(0); > >> + } > >> + close(in); > > > > What if write() fails? That can happen when one of the objects > > you feed here, or its parent objects, is missing from your > > repository -- receiving rev-list would die() and the writing > > child would sigpipe. > > > > I also wonder who waits for this child... > > In general, fork() to avoid bidirectional pipe deadlock is a > good discipline, but in this particular case I think it would be > easier to handle errors if you don't (and it would save another > process). The other side "rev-list --stdin --not --all" is > running a limited traversal, and would not emit anything until > you stop feeding it from --stdin, or until it dies because you > fed it a commit that does not exist. So as long as you check > the error condition from write() for EPIPE to notice the other > end died, I think you are Ok. Thinking about this deeper, I have to say I find my decision to use "--stdin" rather silly, given that I know the exact number of revisions, and their SHA1s. But it might make more sense to rewrite the checking part, instead of fork()ing it. Ciao, Dscho ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-bundle: move objects and references by archive 2007-02-22 16:20 ` Johannes Schindelin @ 2007-02-22 19:10 ` Junio C Hamano 2007-02-22 19:16 ` Johannes Schindelin 0 siblings, 1 reply; 42+ messages in thread From: Junio C Hamano @ 2007-02-22 19:10 UTC (permalink / raw) To: Johannes Schindelin; +Cc: git, Mark Levedahl Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > Thinking about this deeper, I have to say I find my decision to use > "--stdin" rather silly, given that I know the exact number of revisions, > and their SHA1s. The only worry I would have is if that exact number is too large that you cannot pass them sensibly in argv[]. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-bundle: move objects and references by archive 2007-02-22 19:10 ` Junio C Hamano @ 2007-02-22 19:16 ` Johannes Schindelin 2007-02-22 20:05 ` Junio C Hamano 0 siblings, 1 reply; 42+ messages in thread From: Johannes Schindelin @ 2007-02-22 19:16 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Mark Levedahl Hi, On Thu, 22 Feb 2007, Junio C Hamano wrote: > Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > > > Thinking about this deeper, I have to say I find my decision to use > > "--stdin" rather silly, given that I know the exact number of revisions, > > and their SHA1s. > > The only worry I would have is if that exact number is too large > that you cannot pass them sensibly in argv[]. I thought there is only a limitation in bash? Well, anyway, I think that it makes sense writing just a little loop to find if we have the revs or not, to be able to warn about them properly. I'll do that. Ciao, Dscho ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-bundle: move objects and references by archive 2007-02-22 19:16 ` Johannes Schindelin @ 2007-02-22 20:05 ` Junio C Hamano 2007-02-22 20:25 ` Johannes Schindelin 0 siblings, 1 reply; 42+ messages in thread From: Junio C Hamano @ 2007-02-22 20:05 UTC (permalink / raw) To: Johannes Schindelin; +Cc: git, Mark Levedahl Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: >> The only worry I would have is if that exact number is too large >> that you cannot pass them sensibly in argv[]. > > I thought there is only a limitation in bash? I am sure kernel folks on the list would correct me, but my understanding is that is execve(2) limitation and you would get E2BIG. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-bundle: move objects and references by archive 2007-02-22 20:05 ` Junio C Hamano @ 2007-02-22 20:25 ` Johannes Schindelin 0 siblings, 0 replies; 42+ messages in thread From: Johannes Schindelin @ 2007-02-22 20:25 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Mark Levedahl Hi, On Thu, 22 Feb 2007, Junio C Hamano wrote: > Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > > >> The only worry I would have is if that exact number is too large > >> that you cannot pass them sensibly in argv[]. > > > > I thought there is only a limitation in bash? > > I am sure kernel folks on the list would correct me, but my > understanding is that is execve(2) limitation and you would get > E2BIG. Okay. So how about this (on top of my previous fixup): -- snipsnap -- [PATCH] git-bundle: avoid fork() in verify_bundle() We can use the revision walker easily for checking if the prerequisites are met, instead of fork()ing off a rev-list, which would list only the first unmet prerequisite. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- builtin-bundle.c | 75 ++++++++++++++++++++++++++++++++++------------------- 1 files changed, 48 insertions(+), 27 deletions(-) diff --git a/builtin-bundle.c b/builtin-bundle.c index 521bbda..4ac06f9 100644 --- a/builtin-bundle.c +++ b/builtin-bundle.c @@ -19,7 +19,7 @@ static const char bundle_signature[] = "# v2 git bundle\n"; struct ref_list { unsigned int nr, alloc; - struct { + struct ref_list_entry { unsigned char sha1[20]; char *name; } *list; @@ -167,33 +167,54 @@ static int verify_bundle(struct bundle_header *header) * to be verbose about the errors */ struct ref_list *p = &header->prerequisites; - char **argv; - int pid, out, i, ret = 0; - char buffer[1024]; + struct rev_info revs; + const char *argv[] = {NULL, "--all"}; + struct object_array refs; + struct commit *commit; + int i, ret = 0, req_nr; + const char *message = "The bundle requires these lacking revs:"; - argv = xmalloc((p->nr + 4) * sizeof(const char *)); - argv[0] = "rev-list"; - argv[1] = "--not"; - argv[2] = "--all"; - for (i = 0; i < p->nr; i++) - argv[i + 3] = xstrdup(sha1_to_hex(p->list[i].sha1)); - argv[p->nr + 3] = NULL; - out = -1; - pid = fork_with_pipe((const char **)argv, NULL, &out); - if (pid < 0) - return error("Could not fork rev-list"); - while (read_string(out, buffer, sizeof(buffer)) > 0) - ; /* do nothing */ - close(out); - for (i = 0; i < p->nr; i++) - free(argv[i + 3]); - free(argv); - - while (waitpid(pid, &i, 0) < 0) - if (errno != EINTR) - return -1; - if (!ret && (!WIFEXITED(i) || WEXITSTATUS(i))) - return error("At least one prerequisite is lacking."); + init_revisions(&revs, NULL); + for (i = 0; i < p->nr; i++) { + struct ref_list_entry *e = p->list + i; + struct object *o = parse_object(e->sha1); + if (o) { + o->flags |= BOUNDARY_SHOW; + add_pending_object(&revs, o, e->name); + continue; + } + if (++ret == 1) + error(message); + error("%s %s", sha1_to_hex(e->sha1), e->name); + } + if (revs.pending.nr == 0) + return ret; + req_nr = revs.pending.nr; + setup_revisions(2, argv, &revs, NULL); + + memset(&refs, 0, sizeof(struct object_array)); + for (i = 0; i < revs.pending.nr; i++) { + struct object_array_entry *e = revs.pending.objects + i; + add_object_array(e->item, e->name, &refs); + } + + prepare_revision_walk(&revs); + + i = req_nr; + while (i && (commit = get_revision(&revs))) + if (commit->object.flags & BOUNDARY_SHOW) + i--; + + for (i = 0; i < req_nr; i++) + if (!(refs.objects[i].item->flags & SHOWN)) { + if (++ret == 1) + error(message); + error("%s %s", sha1_to_hex(refs.objects[i].item->sha1), + refs.objects[i].name); + } + + for (i = 0; i < refs.nr; i++) + clear_commit_marks((struct commit *)refs.objects[i].item, -1); return ret; } -- 1.5.0.1.2214.gf22e-dirty ^ permalink raw reply related [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-bundle: move objects and references by archive 2007-02-22 6:56 ` Junio C Hamano 2007-02-22 7:08 ` Junio C Hamano @ 2007-02-22 16:17 ` Johannes Schindelin 2007-02-23 2:32 ` Mark Levedahl 2 siblings, 0 replies; 42+ messages in thread From: Johannes Schindelin @ 2007-02-22 16:17 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Mark Levedahl Hi, On Wed, 21 Feb 2007, Junio C Hamano wrote: > Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > > > It was updated to make git-bundle a builtin, and get rid of the tar > > format: now, the first line is supposed to say "# v2 git bundle", the next > > lines either contain a prerequisite ("-" followed by the hash of the > > needed commit), or a ref (the hash of a commit, followed by the name of > > the ref), and finally the pack. As a result, the bundle argument can be > > "-" now. > > That in BNF would be? > > bundle = signature prereq* ref* pack more like bundle = header pack header = signature prereq* ref* nl > I suspect that we might want to possibly: > > - have checksum protection over the part before pack data? Possibly. But would the sha1's not be sufficient? I.e. if the header is corrupted, chances are that the commits are no longer verified correctly. > - give descriptive name to prereq, so that an error message can > say "you need v1.4.4" instead of "you need e267c2f6"? My idea was to allow --since=<date>. You don't have a name there. > - make the fields extensible without updating "v2"? You mean, like, warn about unknown header lines? Yes, that's really easy. > > diff --git a/Documentation/cmd-list.perl b/Documentation/cmd-list.perl > > index a2d6268..f61c77a 100755 > > --- a/Documentation/cmd-list.perl > > +++ b/Documentation/cmd-list.perl > > @@ -70,6 +70,7 @@ git-archive mainporcelain > > git-bisect mainporcelain > > git-blame ancillaryinterrogators > > git-branch mainporcelain > > +git-bundle mainporcelain > > git-cat-file plumbinginterrogators > > git-checkout-index plumbingmanipulators > > git-checkout mainporcelain > > Is this really a mainporcelain? git bundle create <file>? > > diff --git a/Documentation/git-bundle.txt b/Documentation/git-bundle.txt > > new file mode 100644 > > index 0000000..4ea9e85 > > ... > > +Author > > +------ > > +Written by Mark Levedahl <mdl123@verizon.net> > > ... and Dscho. ... not the Documentation (only patched a little to reflect the "--" change). > > diff --git a/builtin-bundle.c b/builtin-bundle.c > > new file mode 100644 > > index 0000000..4bd596a > > --- /dev/null > > +++ b/builtin-bundle.c > > ... > > +static const char *bundle_usage="git-bundle (--create <bundle> <git-rev-list args> | --verify <bundle> | --list-heads <bundle> [refname]... | --unbundle <bundle> [refname]... )"; > > I thought you removed "--" prefixes from subcommands. Yes, and I forgot it there. Will fix. > > +struct bundle_header { > > + struct ref_list prerequisites, references; > > +}; > > (minor style) I find these two members each on its own line > easier to read, as in: > > struct bundle_header { > struct ref_list prerequisites; > struct ref_list references; > }; Will fix. > > +/* this function returns the length of the string */ > > +static int read_string(int fd, char *buffer, int size) > > +{ > > + int i; > > + for (i = 0; i < size - 1; i++) { > > + int count = read(fd, buffer + i, 1); > > + if (count < 0) > > + return error("Read error: %s", strerror(errno)); > > + if (count == 0) { > > + i--; > > + break; > > + } > > + if (buffer[i] == '\n') > > + break; > > + } > > + buffer[i + 1] = '\0'; > > + return i + 1; > > +} > > At least xread() please. I know the reason you read one byte at > a time is because you cannot over-read into pack area, but > somehow this makes me go "hmmmmmm". It's not performance > critical so let's leave it that way. Is bundle supposed to be > streamable? Otherwise we could probably seek back. I wanted unbundle to be able to read from stdin (I did not yet allow "git fetch - master:master"). But I somehow forgot about xread(). D'oh. Will fix. > > +/* returns an fd */ > > +static int read_header(const char *path, struct bundle_header *header) { > > + char buffer[1024]; > > + int fd = open(path, O_RDONLY); > > + > > + if (fd < 0) > > + return error("could not open '%s'", path); > > + if (read_string(fd, buffer, sizeof(buffer)) < 0 || > > + strcmp(buffer, bundle_signature)) { > > + close(fd); > > + return error("'%s' does not look like a v2 bundle file", path); > > + } > > + while (read_string(fd, buffer, sizeof(buffer)) > 0 > > + && buffer[0] != '\n') { > > + int offset = buffer[0] == '-' ? 1 : 0; > > + int len = strlen(buffer); > > + unsigned char sha1[20]; > > + struct ref_list *list = offset ? &header->prerequisites > > + : &header->references; > > + if (get_sha1_hex(buffer + offset, sha1)) { > > + close(fd); > > + return error("invalid SHA1: %s", buffer); > > + } > > + if (buffer[len - 1] == '\n') > > + buffer[len - 1] = '\0'; > > + add_to_ref_list(sha1, buffer + 41 + offset, list); > > + } > > + return fd; > > +} > > Don't you want to look at and validate buffer[40+offset] in any > way, other than what get_sha1_hex() does (which is issspace())? Right. Will fix. (Although I will only write two "s" instead of three...) > > +/* if in && *in >= 0, take that as input file descriptor instead */ > > +static int fork_with_pipe(const char **argv, int *in, int *out) > > +{ > > + int needs_in, needs_out; > > + int fdin[2], fdout[2], pid; > > + > > + needs_in = in && *in < 0; > > + if (needs_in) { > > + if (pipe(fdin) < 0) > > + return error("could not setup pipe"); > > + *in = fdin[1]; > > + } > > + > > + needs_out = out && *out < 0; > > + if (needs_out) { > > + if (pipe(fdout) < 0) > > + return error("could not setup pipe"); > > + *out = fdout[0]; > > + } > > + > > + if ((pid = fork()) < 0) { > > + if (needs_in) { > > + close(fdin[0]); > > + close(fdin[1]); > > + } > > + if (needs_out) { > > + close(fdout[0]); > > + close(fdout[1]); > > + } > > + return error("could not fork"); > > + } > > + if (!pid) { > > + if (needs_in) { > > + dup2(fdin[0], 0); > > + close(fdin[0]); > > + close(fdin[1]); > > + } else if (in) > > + dup2(*in, 0); > > + if (needs_out) { > > + dup2(fdout[1], 1); > > + close(fdout[0]); > > + close(fdout[1]); > > + } else if (out) > > + dup2(*out, 1); > > + exit(execv_git_cmd(argv)); > > + } > > Do you need to close *in and *out that are given by the caller > after dup2() in the child? I probably should. Will fix. > > +static int verify_bundle(struct bundle_header *header) > > +{ > > + /* > > + * Do fast check, then if any prereqs are missing then go line by line > > + * to be verbose about the errors > > + */ > > + struct ref_list *p = &header->prerequisites; > > + const char *argv[5] = {"rev-list", "--stdin", "--not", "--all", NULL}; > > + int pid, in, out, i, ret = 0; > > + char buffer[1024]; > > + > > + in = out = -1; > > + pid = fork_with_pipe(argv, &in, &out); > > + if (pid < 0) > > + return error("Could not fork rev-list"); > > + if (!fork()) { > > + for (i = 0; i < p->nr; i++) { > > + write(in, sha1_to_hex(p->list[i].sha1), 40); > > + write(in, "\n", 1); > > + } > > + close(in); > > + exit(0); > > + } > > + close(in); > > What if write() fails? That can happen when one of the objects > you feed here, or its parent objects, is missing from your > repository -- receiving rev-list would die() and the writing > child would sigpipe. The error is caught afterwards, with WEXITSTATUS(), no? It might be cleaner to check, but is it really necessary? > I also wonder who waits for this child... rev-list. It only exits when that child closes the input. > > + while (read_string(out, buffer, sizeof(buffer)) > 0) { > > + if (ret++ == 0) > > + error ("The bundle requires the following commits you lack:"); > > + fprintf(stderr, "%s", buffer); > > + } > > + close(out); > > I do not think this error message would buy you anything. If > you say: > > echo commit | rev-list --stdin --not --all > > and you got commit back, that means you _do_ have that commit, > and that commit reaches some ref you have (objects associated to > that commit may be still missing). If commit is truly missing, > then you do not get commit output from rev-list as it cannot > even determine if it is "--not --all" or not -- it would error > out, which is the more important check to see if prereqs are > lacking. Yes, I realized that rev-list die()s, and does not output the missing revs -- as git-bundle.sh expected it to. When I found out about that, I failed to remove that bogus error. Will fix. > > + while (waitpid(pid, &i, 0) < 0) > > + if (errno != EINTR) > > + return -1; > > + if (!ret && (!WIFEXITED(i) || WEXITSTATUS(i))) > > + return error("At least one prerequisite is lacking."); > > So you would want to keep this error, perhaps even make it more > helpful by suggesting which ones are missing. rev-list die()s with the first missing. It does not continue, printing other missing revs. > > +static int list_heads(struct bundle_header *header, int argc, const char **argv) > > +{ > > + int i; > > + struct ref_list *r = &header->references; > > + > > + for (i = 0; i < r->nr; i++) { > > + if (argc > 1) { > > + int j; > > + for (j = 1; j < argc; j++) > > + if (!strcmp(r->list[i].name, argv[j])) > > + break; > > + if (j == argc) > > + continue; > > + } > > + printf("%s %s\n", sha1_to_hex(r->list[i].sha1), > > + r->list[i].name); > > + } > > + return 0; > > +} > > I wonder if we want to have a way to list prereqs in similar way. That is what verify should be for. Maybe add "-v" to the "verify" subcommand, make it properly builtin, and output non-missing prerequisites only with "-v"? > > +static void show_commit(struct commit *commit) > > +{ > > + write(1, sha1_to_hex(commit->object.sha1), 40); > > + write(1, "\n", 1); > > + if (commit->parents) { > > + free_commit_list(commit->parents); > > + commit->parents = NULL; > > + } > > +} > > (everywhere) We would want write_in_full() with error handling > that dies even on EPIPE. Or write_or_die()? I mean, if write() fails, we cannot sanely continue anyway, right? > > +static void show_object(struct object_array_entry *p) > > +{ > > + /* An object with name "foo\n0000000..." can be used to > > + * * confuse downstream git-pack-objects very badly. > > + * */ > > An interesting way to wrap comments. Bad, bad vi. Will fix. > > + /* write prerequisites */ > > + memcpy(argv_boundary + 2, argv + 1, argc * sizeof(const char *)); > > + argv_boundary[0] = "rev-list"; > > + argv_boundary[1] = "--boundary"; > > + argv_boundary[argc + 1] = NULL; > > + out = -1; > > + pid = fork_with_pipe(argv_boundary, NULL, &out); > > + if (pid < 0) > > + return -1; > > + while ((i = read_string(out, buffer, sizeof(buffer))) > 0) > > + if (buffer[0] == '-') > > + write(bundle_fd, buffer, i); > > It would be helpful for the recipient if you can append output > >from git-describe (or name-rev) when the buffer lacks "name". Hmm. This could take a long time, as -describe is not really fast. ATM name-rev is not fast either, but I have plans to make it so. So obiously, I would prefer name-rev output. Comments? > > +static int unbundle(struct bundle_header *header, int bundle_fd, > > + int argc, const char **argv) > > +{ > > + const char *argv_index_pack[] = {"index-pack", "--stdin", NULL}; > > + int pid, status, dev_null; > > + > > + if (verify_bundle(header)) > > + return -1; > > + dev_null = open("/dev/null", O_WRONLY); > > + pid = fork_with_pipe(argv_index_pack, &bundle_fd, &dev_null); > > + if (pid < 0) > > + return error("Could not spawn index-pack"); > > + close(bundle_fd); > > + while (waitpid(pid, &status, 0) < 0) > > + if (errno != EINTR) > > + return error("index-pack died"); > > + if (!WIFEXITED(status) || WEXITSTATUS(status)) > > + return error("index-pack exited with status %d", > > + WEXITSTATUS(status)); > > + return list_heads(header, argc, argv); > > +} > > We might want to later use --keep for this as well... Yes, later. It is such a low-hanging fruit that I'd prefer somebody else to get involved with the code. > > diff --git a/index-pack.c b/index-pack.c > > index fa9a0e7..5ccf4c4 100644 > > --- a/index-pack.c > > +++ b/index-pack.c > > @@ -457,8 +457,8 @@ static void parse_pack_objects(unsigned char *sha1) > > /* If input_fd is a file, we should have reached its end now. */ > > if (fstat(input_fd, &st)) > > die("cannot fstat packfile: %s", strerror(errno)); > > - if (S_ISREG(st.st_mode) && st.st_size != consumed_bytes) > > - die("pack has junk at the end"); > > + if (input_fd && S_ISREG(st.st_mode) && st.st_size != consumed_bytes) > > + die("pack has junk at the end: 0%o, %d, %d %d", st.st_mode, (int)st.st_size, (int)consumed_bytes, input_fd); > > > > if (!nr_deltas) > > return; > > ?? As I said in my reply to Nico, if input_fd is 0, but really comes from a file, this check will fail (the bundle is _strictly_ larger than the pack). Would you like the fixes on top of the big commit, or a replacement patch? Ciao, Dscho ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-bundle: move objects and references by archive 2007-02-22 6:56 ` Junio C Hamano 2007-02-22 7:08 ` Junio C Hamano 2007-02-22 16:17 ` Johannes Schindelin @ 2007-02-23 2:32 ` Mark Levedahl 2007-02-23 4:39 ` Junio C Hamano 2 siblings, 1 reply; 42+ messages in thread From: Mark Levedahl @ 2007-02-23 2:32 UTC (permalink / raw) To: Junio C Hamano; +Cc: Johannes Schindelin, git Junio C Hamano wrote: >> diff --git a/Documentation/cmd-list.perl b/Documentation/cmd-list.perl >> index a2d6268..f61c77a 100755 >> --- a/Documentation/cmd-list.perl >> +++ b/Documentation/cmd-list.perl >> @@ -70,6 +70,7 @@ git-archive mainporcelain >> git-bisect mainporcelain >> git-blame ancillaryinterrogators >> git-branch mainporcelain >> +git-bundle mainporcelain >> git-cat-file plumbinginterrogators >> git-checkout-index plumbingmanipulators >> git-checkout mainporcelain >> > > Is this really a mainporcelain? > I would say ancillarymanipulators (or perhaps synchelpers). > > git bundle has four commands: create, verify, list-heads, and unbundle. The last two are pure helper functions, basically plumbing. Verify is questionable as to where it lies. But, create is the only way to create a bundle, is logically equivalent to git push as a user command to move data, so I called it mainporcelain because that is how git push is classified. >> + /* write prerequisites */ >> + memcpy(argv_boundary + 2, argv + 1, argc * sizeof(const char *)); >> + argv_boundary[0] = "rev-list"; >> + argv_boundary[1] = "--boundary"; >> + argv_boundary[argc + 1] = NULL; >> + out = -1; >> + pid = fork_with_pipe(argv_boundary, NULL, &out); >> + if (pid < 0) >> + return -1; >> + while ((i = read_string(out, buffer, sizeof(buffer))) > 0) >> + if (buffer[0] == '-') >> + write(bundle_fd, buffer, i); >> > > It would be helpful for the recipient if you can append output > from git-describe (or name-rev) when the buffer lacks "name". > I found the actual commit summary message (i.e., git-rev-list --pretty=one --max-count=1 sha1) the most useful of the various summaries available. Mark ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-bundle: move objects and references by archive 2007-02-23 2:32 ` Mark Levedahl @ 2007-02-23 4:39 ` Junio C Hamano 0 siblings, 0 replies; 42+ messages in thread From: Junio C Hamano @ 2007-02-23 4:39 UTC (permalink / raw) To: Mark Levedahl; +Cc: Johannes Schindelin, git Mark Levedahl <mdl123@verizon.net> writes: > I found the actual commit summary message (i.e., git-rev-list > --pretty=one --max-count=1 sha1) the most useful of the various > summaries available. I would agree that would be the case, and Johannes's follow-up seems to do that. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [PATCH] Add git-bundle: move objects and references by archive 2007-02-22 0:59 ` Johannes Schindelin 2007-02-22 3:28 ` Nicolas Pitre 2007-02-22 6:56 ` Junio C Hamano @ 2007-02-22 9:31 ` Johannes Sixt 2 siblings, 0 replies; 42+ messages in thread From: Johannes Sixt @ 2007-02-22 9:31 UTC (permalink / raw) To: git Johannes Schindelin wrote: > +/* if in && *in >= 0, take that as input file descriptor instead */ > +static int fork_with_pipe(const char **argv, int *in, int *out) > +{ > + int needs_in, needs_out; > + int fdin[2], fdout[2], pid; > + > + needs_in = in && *in < 0; > + if (needs_in) { > + if (pipe(fdin) < 0) > + return error("could not setup pipe"); > + *in = fdin[1]; > + } > + > + needs_out = out && *out < 0; > + if (needs_out) { > + if (pipe(fdout) < 0) > + return error("could not setup pipe"); > + *out = fdout[0]; > + } > + > + if ((pid = fork()) < 0) { > + if (needs_in) { > + close(fdin[0]); > + close(fdin[1]); > + } > + if (needs_out) { > + close(fdout[0]); > + close(fdout[1]); > + } > + return error("could not fork"); > + } > + if (!pid) { > + if (needs_in) { > + dup2(fdin[0], 0); > + close(fdin[0]); > + close(fdin[1]); > + } else if (in) > + dup2(*in, 0); > + if (needs_out) { > + dup2(fdout[1], 1); > + close(fdout[0]); > + close(fdout[1]); > + } else if (out) > + dup2(*out, 1); > + exit(execv_git_cmd(argv)); > + } > + if (needs_in) > + close(fdin[0]); > + if (needs_out) > + close(fdout[1]); > + return pid; > +} This function looks very similar to spawnvppe, which I use in the MinGW port for a number of fork/exec pairs. Do you see a chance to make this into a helper that can be used in more places (so that it reduces the differences to the MinGW code)? > + in = out = -1; > + pid = fork_with_pipe(argv, &in, &out); > + if (pid < 0) > + return error("Could not fork rev-list"); > + if (!fork()) { > + for (i = 0; i < p->nr; i++) { > + write(in, sha1_to_hex(p->list[i].sha1), 40); > + write(in, "\n", 1); > + } > + close(in); > + exit(0); > + } > + close(in); Oops, fork error check missing? -- Hannes ^ permalink raw reply [flat|nested] 42+ messages in thread
end of thread, other threads:[~2007-02-23 4:39 UTC | newest] Thread overview: 42+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2007-02-16 12:53 Re: [PATCH] Add git-unbundle - unpack objects and references for disconnected transfer Mark Levedahl 2007-02-16 19:57 ` Junio C Hamano 2007-02-16 23:21 ` Mark Levedahl 2007-02-17 6:57 ` Junio C Hamano 2007-02-17 14:40 ` Mark Levedahl 2007-02-17 17:53 ` Junio C Hamano 2007-02-17 14:50 ` Mark Levedahl 2007-02-17 18:41 ` Junio C Hamano 2007-02-18 17:27 ` [PATCH] Add git-bundle: move objects and references by archive Mark Levedahl 2007-02-18 22:47 ` Mark Levedahl 2007-02-16 21:58 ` [PATCH] Add git-unbundle - unpack objects and references for disconnected transfer Mark Levedahl 2007-02-16 22:51 ` Johannes Schindelin -- strict thread matches above, loose matches on Subject: below -- 2007-02-18 22:47 [PATCH] Add git-bundle: move objects and references by archive Mark Levedahl 2007-02-19 1:07 ` Johannes Schindelin 2007-02-19 1:50 ` Junio C Hamano 2007-02-19 2:02 ` Johannes Schindelin 2007-02-19 7:56 ` Shawn O. Pearce 2007-02-19 13:29 ` Mark Levedahl 2007-02-19 15:03 ` Johannes Schindelin 2007-02-19 1:49 ` Junio C Hamano [not found] <Pine.LNX.4.63.0702220157130.22628@wbgn013.biozentrum.uni-wuerz burg.de> 2007-02-22 0:59 ` Johannes Schindelin 2007-02-22 3:28 ` Nicolas Pitre 2007-02-22 15:55 ` Johannes Schindelin 2007-02-22 16:14 ` Simon 'corecode' Schubert 2007-02-22 16:28 ` Nicolas Pitre 2007-02-22 16:37 ` Johannes Schindelin 2007-02-22 16:46 ` Simon 'corecode' Schubert 2007-02-22 17:09 ` Johannes Schindelin 2007-02-22 16:24 ` Nicolas Pitre 2007-02-22 17:12 ` Johannes Schindelin 2007-02-22 17:21 ` Nicolas Pitre 2007-02-22 6:56 ` Junio C Hamano 2007-02-22 7:08 ` Junio C Hamano 2007-02-22 16:20 ` Johannes Schindelin 2007-02-22 19:10 ` Junio C Hamano 2007-02-22 19:16 ` Johannes Schindelin 2007-02-22 20:05 ` Junio C Hamano 2007-02-22 20:25 ` Johannes Schindelin 2007-02-22 16:17 ` Johannes Schindelin 2007-02-23 2:32 ` Mark Levedahl 2007-02-23 4:39 ` Junio C Hamano 2007-02-22 9:31 ` Johannes Sixt
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).