Git development
* Re: Gnome chose Git
From: demerphq @ 2009-03-19 21:51 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Jeff King, Git
In-Reply-To: <20090319214432.GV23521@spearce.org>

2009/3/19 Shawn O. Pearce <spearce@spearce.org>:
> "Shawn O. Pearce" <spearce@spearce.org> wrote:
>> demerphq <demerphq@gmail.com> wrote:
>> > Outside of parsing the reflog directly, (which feels wrong and dirty
>> > to me), how does one find out the times that a reflog entry was
>> > created?
>> >
>> > The closest thing i could find was git log -g, but that shows the time
>>
>>   git reflog -g branch@{now}
>
> Arrgh, I of course actually meant
>
>    git log -g branch@{now}
>
>> the @{now} suffix is the magic to make it show the time.

Ah! Much nicer! Thanks.

Is there by any chance any way to set the date format it uses to
something more suitable for machine processing?

Yves

-- 
perl -Mre=debug -e "/just|another|perl|hacker/"


* Re: Gnome chose Git
From: Shawn O. Pearce @ 2009-03-19 21:53 UTC (permalink / raw)
  To: demerphq; +Cc: Jeff King, Git
In-Reply-To: <9b18b3110903191451u56bbee7biac3a1fee4a36b71d@mail.gmail.com>

demerphq <demerphq@gmail.com> wrote:
> 2009/3/19 Shawn O. Pearce <spearce@spearce.org>:
> > "Shawn O. Pearce" <spearce@spearce.org> wrote:
> >
> > git log -g branch@{now}
> 
> Ah! Much nicer! Thanks.
> 
> Is there by any chance any way to set the date format it uses to
> something more suitable for machine processing?

I don't think so.  If you want to machine process it, why not
just read the reflog directly?  It's a really simple format.

-- 
Shawn.
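
For illustration, a minimal sketch of reading a reflog directly. It
assumes each line has the layout "<old-sha> <new-sha> <name> <email>
<epoch> <tz>", then a TAB, then the message. The function name
parse_reflog and the sample entry are made up for this example; only the
line layout is taken from git.

```shell
#!/bin/sh
# Sketch only: print "<epoch> <tz> <new-sha> <message>" for each reflog line on stdin.
# Assumed layout: old-sha SP new-sha SP ident SP epoch SP tz TAB message
parse_reflog () {
	awk -F '\t' '{
		n = split($1, f, " ")        # fields in front of the TAB
		msg = ""
		if (NF > 1) {                # a TAB is present, so there is a message
			msg = $0
			sub(/^[^\t]*\t/, "", msg)
		}
		printf "%s %s %s %s\n", f[n - 1], f[n], f[2], msg
	}'
}

# Made-up sample entry; a real one would come from a file such as .git/logs/HEAD.
printf '%s %s A U Thor <author@example.com> 1237499472 +0000\tcommit (initial): first\n' \
	0000000000000000000000000000000000000000 \
	1111111111111111111111111111111111111111 |
parse_reflog
```

The epoch and timezone are taken from the end of the ident part, so
names containing spaces do not shift the fields.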


* Re: [PATCH] Produce a nicer output in case of sha1_object_info failures in ls-tree -l
From: Junio C Hamano @ 2009-03-19 21:55 UTC (permalink / raw)
  To: Alex Riesen; +Cc: git, Jakub Narebski
In-Reply-To: <20090319203002.GA31014@blimp.localdomain>

Alex Riesen <raa.lkml@gmail.com> writes:

> Initialize the size with 0. The error message is already printed
> by sha1_object_info itself. Otherwise the uninitialized size is
> printed, which does not make any sense at all.
>
> Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
> ---
>
>  builtin-ls-tree.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/builtin-ls-tree.c b/builtin-ls-tree.c
> index fca4631..a8cdafb 100644
> --- a/builtin-ls-tree.c
> +++ b/builtin-ls-tree.c
> @@ -60,7 +60,6 @@ static int show_tree(const unsigned char *sha1, const char *base, int baselen,
>  {
>  	int retval = 0;
>  	const char *type = blob_type;
> -	unsigned long size;
>  
>  	if (S_ISGITLINK(mode)) {
>  		/*
> @@ -91,6 +90,7 @@ static int show_tree(const unsigned char *sha1, const char *base, int baselen,
>  	if (!(ls_options & LS_NAME_ONLY)) {
>  		if (ls_options & LS_SHOW_SIZE) {
>  			if (!strcmp(type, blob_type)) {
> +				unsigned long size = 0;
>  				sha1_object_info(sha1, &size);
>  				printf("%06o %s %s %7lu\t", mode, type,
>  				       abbrev ? find_unique_abbrev(sha1, abbrev)

Hmm, shouldn't you be checking the return value from sha1_object_info()
and skipping the printf() altogether instead?


* Re: git am from scratch
From: Junio C Hamano @ 2009-03-19 21:57 UTC (permalink / raw)
  To: Jeff King; +Cc: Andreas Gruenbacher, git
In-Reply-To: <20090319210214.GA17589@coredump.intra.peff.net>

Jeff King <peff@peff.net> writes:

> Anyway, here is a not-very-well-tested patch to get "git am" to apply on
> top of an empty repository (i.e., it worked on my utterly simplistic
> test case and I didn't think too hard about what else might have been
> broken). Maybe it will give a good start to somebody who wants to work
> on this.

The patch gets the ball rolling in the right direction, I think.  You
will also need to audit the --abort and --skip codepaths carefully,
though.

> diff --git a/git-am.sh b/git-am.sh
> index d339075..bcc600d 100755
> --- a/git-am.sh
> +++ b/git-am.sh
> @@ -290,17 +290,23 @@ else
>  		: >"$dotest/rebasing"
>  	else
>  		: >"$dotest/applying"
> -		git update-ref ORIG_HEAD HEAD
> +		if git rev-parse --quiet --verify HEAD; then
> +			git update-ref ORIG_HEAD HEAD
> +		else
> +			rm -f "$GIT_DIR/ORIG_HEAD"
> +		fi
>  	fi
>  fi
>  
>  case "$resolved" in
>  '')
> -	files=$(git diff-index --cached --name-only HEAD --) || exit
> -	if test "$files"
> -	then
> -		: >"$dotest/dirtyindex"
> -		die "Dirty index: cannot apply patches (dirty: $files)"
> +	if git rev-parse --quiet --verify HEAD; then
> +		files=$(git diff-index --cached --name-only HEAD --) || exit
> +		if test "$files"
> +		then
> +			: >"$dotest/dirtyindex"
> +			die "Dirty index: cannot apply patches (dirty: $files)"
> +		fi
>  	fi
>  esac
>  
> @@ -541,7 +547,7 @@ do
>  	fi
>  
>  	tree=$(git write-tree) &&
> -	parent=$(git rev-parse --verify HEAD) &&
> +	parent=$(git rev-parse --quiet --verify HEAD)
>  	commit=$(
>  		if test -n "$ignore_date"
>  		then
> @@ -552,7 +558,7 @@ do
>  			GIT_COMMITTER_DATE="$GIT_AUTHOR_DATE"
>  			export GIT_COMMITTER_DATE
>  		fi &&
> -		git commit-tree $tree -p $parent <"$dotest/final-commit"
> +		git commit-tree $tree ${parent:+-p $parent} <"$dotest/final-commit"
>  	) &&
>  	git update-ref -m "$GIT_REFLOG_ACTION: $FIRSTLINE" HEAD $commit $parent ||
>  	stop_here $this


* Re: Gnome chose Git
From: demerphq @ 2009-03-19 21:59 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Jeff King, Git
In-Reply-To: <20090319215331.GW23521@spearce.org>

2009/3/19 Shawn O. Pearce <spearce@spearce.org>:
> demerphq <demerphq@gmail.com> wrote:
>> 2009/3/19 Shawn O. Pearce <spearce@spearce.org>:
>> > "Shawn O. Pearce" <spearce@spearce.org> wrote:
>> >
>> > git log -g branch@{now}
>>
>> Ah! Much nicer! Thanks.
>>
>> Is there by any chance any way to set the date format it uses to
>> something more suitable for machine processing?
>
> I don't think so.  If you want to machine process it, why not
> just read the reflog directly?  It's a really simple format.

Mostly my problem with that is that it violates the abstraction. If I
update git and the reflog format changes, my script breaks. I don't
necessarily know where it will be located, etc. And while no doubt I
can reverse engineer the format, well, who knows, maybe I'll miss
something important. I mean, is it documented anywhere?

So I guess if the format were documented (and thus changing it would
break compatibility and be noted in the changes file) then it would be
fine to do so, but it seems to me that providing a way to access the
reflog data in a structured way via a plumbing-level command makes more
sense. (At the very least this spares the user from having to figure
out where the log is stored.)

Yves




-- 
perl -Mre=debug -e "/just|another|perl|hacker/"


* Re: [PATCH] Produce a nicer output in case of sha1_object_info failures in ls-tree -l
From: Alex Riesen @ 2009-03-19 22:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jakub Narebski
In-Reply-To: <7v4oxp89eb.fsf@gitster.siamese.dyndns.org>

Junio C Hamano, Thu, Mar 19, 2009 22:55:56 +0100:
> Alex Riesen <raa.lkml@gmail.com> writes:
> > @@ -91,6 +90,7 @@ static int show_tree(const unsigned char *sha1, const char *base, int baselen,
> >  	if (!(ls_options & LS_NAME_ONLY)) {
> >  		if (ls_options & LS_SHOW_SIZE) {
> >  			if (!strcmp(type, blob_type)) {
> > +				unsigned long size = 0;
> >  				sha1_object_info(sha1, &size);
> >  				printf("%06o %s %s %7lu\t", mode, type,
> >  				       abbrev ? find_unique_abbrev(sha1, abbrev)
> 
> Hmm, shouldn't you be checking the return value from sha1_object_info()
> and skipping the printf() altogether instead?

But then I cannot know the name of the failed tree entry.


* Re: [PATCH] Microoptimize strbuf_cmp
From: Junio C Hamano @ 2009-03-19 22:01 UTC (permalink / raw)
  To: Alex Riesen; +Cc: git, Pierre Habouzit
In-Reply-To: <20090319210931.GB31014@blimp.localdomain>

Alex Riesen <raa.lkml@gmail.com> writes:

> Make it inline and cleanup a bit. It is definitely less code
> including object code, but it is not always measurably faster
> (but mostly is).

The only in-tree user seems to be rerere, so inlining for that single
caller will reduce the object size, but I am not sure if this is a good
change in the longer term if we want to encourage the use of strbuf
library.

The rewrite of the logic does seem worth doing, though.


* Re: [PATCH] Define a version of lstat(2) with posix semantics
From: Junio C Hamano @ 2009-03-19 22:08 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Alex Riesen, Git Mailing List, Johannes Sixt, Jeff King, layer
In-Reply-To: <alpine.DEB.1.00.0903191155300.10279@pacific.mpi-cbg.de>

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Hi,
>
> On Thu, 19 Mar 2009, Alex Riesen wrote:
>
>> So that Cygwin port can continue work around its supporting
>> library and get access to its faked file attributes.
>> 
>> Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
>> ---
>
> [patch not inlined: therefore you'll have to guess what I am referring to]
>
> It seems quite wrong to define something for other platforms when only 
> Cygwin is affected.
>
> I'd rather just disable WIN32_STAT for Cygwin, because otherwise, we will 
> keep running into issues.

I am inclined to agree with this.

Back when Cygwin was the only choice, it was a way to bring the benefit of git
to folks who have to work on Windows, but with the recent advances in
msysgit, the Cygwin port should probably return to a role more in line with
the overall Cygwin theme of bringing more POSIXy sanity into the Windows
world.  I personally see a Cygwin port as a vehicle for people who care
about having a POSIXly-correct world where files have executable bits and
lines are terminated with LF on Windows.  If you want to have a system
that is closer to Windows' world view, there is (or will be, as msysgit is
still officially marked as alpha) a viable alternative, and the
"selective cheating" the current Cygwin port does may benefit nobody.

But I do not work on Windows myself, so please take this only as a mere
uninformed opinion, nothing more.


* [PATCH] git-am: teach git-am to apply a patch to an unborn branch
From: Nanako Shiraishi @ 2009-03-19 22:12 UTC (permalink / raw)
  To: git

Signed-off-by: Nanako Shiraishi <nanako3@lavabit.com>
---
 git-am.sh     |   33 ++++++++++++++++++++++++++++-----
 t/t4150-am.sh |   15 +++++++++++++++
 2 files changed, 43 insertions(+), 5 deletions(-)

diff --git a/git-am.sh b/git-am.sh
index d339075..c21642b 100755
--- a/git-am.sh
+++ b/git-am.sh
@@ -36,6 +36,13 @@ cd_to_toplevel
 git var GIT_COMMITTER_IDENT >/dev/null ||
 	die "You need to set your committer info first"
 
+if git rev-parse --verify -q HEAD >/dev/null
+then
+	HAS_HEAD=yes
+else
+	HAS_HEAD=
+fi
+
 sq () {
 	for sqarg
 	do
@@ -290,16 +297,26 @@ else
 		: >"$dotest/rebasing"
 	else
 		: >"$dotest/applying"
-		git update-ref ORIG_HEAD HEAD
+		if test -n "$HAS_HEAD"
+		then
+			git update-ref ORIG_HEAD HEAD
+		else
+			git update-ref -d ORIG_HEAD >/dev/null 2>&1
+		fi
 	fi
 fi
 
 case "$resolved" in
 '')
-	files=$(git diff-index --cached --name-only HEAD --) || exit
+	if test -n "$HAS_HEAD"
+	then
+		files=$(git diff-index --cached --name-only HEAD --)
+	else
+		files=$(git ls-files)
+	fi || exit
 	if test "$files"
 	then
-		: >"$dotest/dirtyindex"
+		test -n "$HAS_HEAD" && : >"$dotest/dirtyindex"
 		die "Dirty index: cannot apply patches (dirty: $files)"
 	fi
 esac
@@ -541,7 +558,13 @@ do
 	fi
 
 	tree=$(git write-tree) &&
-	parent=$(git rev-parse --verify HEAD) &&
+	if parent=$(git rev-parse --verify -q HEAD)
+	then
+		pparent="-p $parent"
+	else
+		echo >&2 "applying to an empty history"
+		parent= pparent=
+	fi &&
 	commit=$(
 		if test -n "$ignore_date"
 		then
@@ -552,7 +575,7 @@ do
 			GIT_COMMITTER_DATE="$GIT_AUTHOR_DATE"
 			export GIT_COMMITTER_DATE
 		fi &&
-		git commit-tree $tree -p $parent <"$dotest/final-commit"
+		git commit-tree $tree $pparent <"$dotest/final-commit"
 	) &&
 	git update-ref -m "$GIT_REFLOG_ACTION: $FIRSTLINE" HEAD $commit $parent ||
 	stop_here $this
diff --git a/t/t4150-am.sh b/t/t4150-am.sh
index 5e65afa..b97d102 100755
--- a/t/t4150-am.sh
+++ b/t/t4150-am.sh
@@ -290,4 +290,19 @@ test_expect_success 'am --ignore-date' '
 	echo "$at" | grep "+0000"
 '
 
+test_expect_success 'am in an unborn branch' '
+	rm -fr subdir &&
+	mkdir -p subdir &&
+	git format-patch --numbered-files -o subdir -1 first &&
+	(
+		cd subdir &&
+		git init &&
+		git am 1
+	) &&
+	result=$(
+		cd subdir && git rev-parse HEAD^{tree}
+	) &&
+	test "z$result" = "z$(git rev-parse first^{tree})"
+'
+
 test_done
-- 
1.6.2.1

-- 
Nanako Shiraishi
http://ivory.ap.teacup.com/nanako3/
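
A standalone sketch of the unborn-branch check this patch relies on:
"git rev-parse --verify -q HEAD" exits non-zero when HEAD does not
resolve to a commit, which is the case right after "git init". The
function name has_head and the throwaway repository are assumptions for
this example, and it uses "git -C", which needs a newer git than the
one this patch targets.

```shell
#!/bin/sh
# Succeeds if the repository at $1 has at least one commit on HEAD.
has_head () {
	git -C "$1" rev-parse --verify -q HEAD >/dev/null 2>&1
}

# Try it on a throwaway repository (skipped when git is not installed).
if command -v git >/dev/null 2>&1
then
	repo=$(mktemp -d) &&
	git init -q "$repo" &&
	if has_head "$repo"
	then
		echo "HEAD exists"
	else
		echo "unborn branch"
	fi
	rm -rf "$repo"
fi
```

Computing the answer once, as the patch does with HAS_HEAD, avoids
repeating the rev-parse call at every place that needs the distinction.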


* Re: [PATCH] Produce a nicer output in case of sha1_object_info failures in ls-tree -l
From: Junio C Hamano @ 2009-03-19 22:13 UTC (permalink / raw)
  To: Alex Riesen; +Cc: git, Jakub Narebski
In-Reply-To: <20090319220020.GA8433@blimp.localdomain>

Alex Riesen <raa.lkml@gmail.com> writes:

> Junio C Hamano, Thu, Mar 19, 2009 22:55:56 +0100:
>> Alex Riesen <raa.lkml@gmail.com> writes:
>> > @@ -91,6 +90,7 @@ static int show_tree(const unsigned char *sha1, const char *base, int baselen,
>> >  	if (!(ls_options & LS_NAME_ONLY)) {
>> >  		if (ls_options & LS_SHOW_SIZE) {
>> >  			if (!strcmp(type, blob_type)) {
>> > +				unsigned long size = 0;
>> >  				sha1_object_info(sha1, &size);
>> >  				printf("%06o %s %s %7lu\t", mode, type,
>> >  				       abbrev ? find_unique_abbrev(sha1, abbrev)
>> 
>> Hmm, shouldn't you be checking the return value from sha1_object_info()
>> and skipping the printf() altogether instead?
>
> But then I cannot know the name of the failed tree entry.

Why?

	if (sha1_object_info() == OBJ_BAD)
		die("object recorded at tree entry %s is bad", pathname);
	printf ...


* Git Large Object Support Proposal
From: Scott Chacon @ 2009-03-19 22:14 UTC (permalink / raw)
  To: git list

I have been thinking about this for a while, so I wanted to get some
feedback. I've been seeing a number of people interested in using Git
for game development and whatnot, or otherwise committing huge files.
This will occasionally wreak some havoc on our servers (GitHub)
because of the memory mapping involved.  Thus, we would really like to
see a nicer way for Git to handle big files.

There are two proposals on the GSoC page to deal with this - the
'remote alternates/lazy clone' idea and the 'sparse/narrow clone'
idea.  I'm wondering if instead it might be an interesting idea to
concentrate on the 'stub objects' for large blobs that Jakub was
talking about a few months ago:

http://markmail.org/message/my4kvrhsza2yjmlt

But where Git instead stores a stub object and the large binary object
is pulled in via a separate mechanism. I was thinking that the client
could set a max file size and when a binary object larger than that is
staged, Git instead writes a stub blob like:

==
blob [size]\0
[sha of large blob]
==

Then in the tree, we give the stubbed large file a special mode or type:

==
100644 blob 3bb0e8592a41ae3185ee32266c860714980dbed7 README
040000 tree 557b70d2374ae77869711cb583e6d59b8aad5e8b lib
150000 blob 502feb557e2097d38a643e336f722525bc7ea077 big-ass-file.mpeg
==

Sort of like a symlink, but instead of the blob it points to
containing the link path, it just contains the SHA of the real blob.
Then we can have a command like 'git media' or something that helps
manage those, pull them down from a specified server (specified in a
.gitmedia file) and transfer new ones up before a push is allowed,
etc.  This makes it sort of a cross between a symlink and a submodule.

== .git/config
[media]
    push-url = [aws/scp/sftp/etc server]
    password = [write password]
    token = [write token]

== .gitmedia
[server]
    pull-url = [aws/scp/sftp/etc read only url]

This might be nice because all the objects would be local, so most of
the changes to tools should be rather small - we can't
merge/diff/blame large binary stuff really anyhow, right?  Also, the
really large files could be written and served over protocols that are
better for large file transfer (scp, sftp, etc) - the media server
could be different than the git server.  Then our servers can stop
choking when someone tries to add and push a 2 gig file.

If two users have different settings, one would simply have the stub
and the other not; 'git media update' could check the local db
first before fetching.  If you change the max-file-size at some point,
the trees would just either stop using the stubs (if you lowered it)
for anything that now fits under the size limit, or start using stubs
for files that are now over it.
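
The size-threshold decision described above could be sketched like
this. Everything here is hypothetical: there is no 'git media' command,
and the names stage_entry and max_size and the output labels are
invented for the example. The only real command used is
'git hash-object', which prints the blob SHA for a file's contents.

```shell
#!/bin/sh
# Hypothetical sketch: decide whether a file would be staged as a normal blob
# or as a stub holding the SHA of the large blob.
# usage: stage_entry <file> <max_size_in_bytes>
stage_entry () {
	file=$1 max_size=$2
	size=$(($(wc -c <"$file")))          # arithmetic strips any padding from wc
	if [ "$size" -gt "$max_size" ]
	then
		# The stub content is just the SHA of the real blob.
		printf 'stub %s\n' "$(git hash-object "$file")"
	else
		printf 'blob %s\n' "$file"
	fi
}

# Demo with a small temporary file and a 4-byte limit (needs git installed).
tmp=$(mktemp) && printf 'hello world\n' >"$tmp" && stage_entry "$tmp" 4
rm -f "$tmp"
```

Because the decision depends only on the local limit, two users with
different limits would stage the same file differently, which is the
situation the paragraph above describes.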

The workflow may go something like this:

$ cd git-repo
$ cp ~/huge-file.mpg .
$ git media add s3://chacon-media
# wrote new media server url to .gitmedia
$ git add .
# huge-file.mpg is larger than max-file-size (10M) and will be added as media (see 'git media')
$ git status
# On branch master
#
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#	new file:   .gitmedia
#	new media:   huge-file.mpg
#
$ git push
Uploading new media to s3://chacon-media
Uploading media files 100% (5/5), done.
New media uploaded, pushing to Git server
Counting objects: 14, done.
Compressing objects: 100% (9/9), done.
Writing objects: 100% (10/10), 1.04 KiB, done.
Total 10 (delta 4), reused 0 (delta 0)
To git@github.com:schacon/mediaproject.git
 + dbb5d00...9647674 master -> master


On the client side we would have something like this:

$ git clone git://github.com/schacon/mediaproject.git
Initialized empty Git repository in /private/tmp/simplegit/.git/
remote: Counting objects: 270, done.
remote: Compressing objects: 100% (148/148), done.
remote: Total 270 (delta 103), reused 198 (delta 77)
Receiving objects: 100% (270/270), 24.31 KiB, done.
Resolving deltas: 100% (103/103), done.
# You have unfetched media, run 'git media update' to get large media files
$ git status
# On branch master
#
# Media files to be fetched:
#   (use "git media update <file>..." to fetch)
#
#	unfetched:   huge-file.mpg
#
$ git media update
Fetching media from s3://chacon-media
Fetching media files 100% (1/1), done.


Anyhow, you get the picture.  I would be happy to try to get a proof
of concept of this done, but I wanted to know if there are any serious
objections to this approach to large media.


* Re: [PATCH] git-am: teach git-am to apply a patch to an unborn branch
From: Junio C Hamano @ 2009-03-19 22:21 UTC (permalink / raw)
  To: Nanako Shiraishi; +Cc: git
In-Reply-To: <20090320071231.6117@nanako3.lavabit.com>

Nanako Shiraishi <nanako3@lavabit.com> writes:

> Signed-off-by: Nanako Shiraishi <nanako3@lavabit.com>
> ---
>  git-am.sh     |   33 ++++++++++++++++++++++++++++-----
>  t/t4150-am.sh |   15 +++++++++++++++
>  2 files changed, 43 insertions(+), 5 deletions(-)
>
> diff --git a/git-am.sh b/git-am.sh
> index d339075..c21642b 100755
> --- a/git-am.sh
> +++ b/git-am.sh
> @@ -36,6 +36,13 @@ cd_to_toplevel
>  git var GIT_COMMITTER_IDENT >/dev/null ||
>  	die "You need to set your committer info first"
>  
> +if git rev-parse --verify -q HEAD >/dev/null
> +then
> +	HAS_HEAD=yes
> +else
> +	HAS_HEAD=
> +fi
> +

Probably nicer this way than Peff's as I suspect we would need to special
case the unborn-branch case a lot more.  Have you tried --skip and --abort
codepaths with your patch?

>  sq () {
>  	for sqarg
>  	do
> @@ -290,16 +297,26 @@ else
>  		: >"$dotest/rebasing"
>  	else
>  		: >"$dotest/applying"
> -		git update-ref ORIG_HEAD HEAD
> +		if test -n "$HAS_HEAD"
> +		then
> +			git update-ref ORIG_HEAD HEAD
> +		else
> +			git update-ref -d ORIG_HEAD >/dev/null 2>&1
> +		fi

So is this part.

>  	fi
>  fi
>  
>  case "$resolved" in
>  '')
> -	files=$(git diff-index --cached --name-only HEAD --) || exit
> +	if test -n "$HAS_HEAD"
> +	then
> +		files=$(git diff-index --cached --name-only HEAD --)
> +	else
> +		files=$(git ls-files)
> +	fi || exit
>  	if test "$files"
>  	then
> -		: >"$dotest/dirtyindex"
> +		test -n "$HAS_HEAD" && : >"$dotest/dirtyindex"
>  		die "Dirty index: cannot apply patches (dirty: $files)"
>  	fi
>  esac

And here...

> @@ -541,7 +558,13 @@ do
>  	fi
>  
>  	tree=$(git write-tree) &&
> -	parent=$(git rev-parse --verify HEAD) &&
> +	if parent=$(git rev-parse --verify -q HEAD)
> +	then
> +		pparent="-p $parent"
> +	else
> +		echo >&2 "applying to an empty history"
> +		parent= pparent=
> +	fi &&
>  	commit=$(
>  		if test -n "$ignore_date"
>  		then
> @@ -552,7 +575,7 @@ do
>  			GIT_COMMITTER_DATE="$GIT_AUTHOR_DATE"
>  			export GIT_COMMITTER_DATE
>  		fi &&
> -		git commit-tree $tree -p $parent <"$dotest/final-commit"
> +		git commit-tree $tree $pparent <"$dotest/final-commit"

Peff's ${a+something $a} trick looks more "expert" here ;-).

> diff --git a/t/t4150-am.sh b/t/t4150-am.sh
> index 5e65afa..b97d102 100755
> --- a/t/t4150-am.sh
> +++ b/t/t4150-am.sh
> @@ -290,4 +290,19 @@ test_expect_success 'am --ignore-date' '
>  	echo "$at" | grep "+0000"
>  '
>  
> +test_expect_success 'am in an unborn branch' '
> +	rm -fr subdir &&
> +	mkdir -p subdir &&
> +	git format-patch --numbered-files -o subdir -1 first &&
> +	(
> +		cd subdir &&
> +		git init &&
> +		git am 1
> +	) &&
> +	result=$(
> +		cd subdir && git rev-parse HEAD^{tree}
> +	) &&
> +	test "z$result" = "z$(git rev-parse first^{tree})"
> +'
> +

Looks good.


* Re: [PATCH] Microoptimize strbuf_cmp
From: Alex Riesen @ 2009-03-19 22:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Pierre Habouzit
In-Reply-To: <7vvdq56ukb.fsf@gitster.siamese.dyndns.org>

It can be less object code and may even be faster, even if at the
moment there are no callers to take advantage of that. This
implementation can be trivially made inlinable later.

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
---

Junio C Hamano, Thu, Mar 19, 2009 23:01:40 +0100:
> Alex Riesen <raa.lkml@gmail.com> writes:
> 
> > Make it inline and cleanup a bit. It is definitely less code
> > including object code, but it is not always measurably faster
> > (but mostly is).
> 
> The only in-tree user seems to be rerere, so inlining for that single
> caller will reduce the object size, but I am not sure if this is a good
> change in the longer term if we want to encourage the use of strbuf
> library.
> 
> The rewrite of the logic does seem worth doing, though.

But then it is only half of the micro-optimization. In this case,
the cost of the call to the function is comparable with the change
to the code.

Anyway, FWIW.

 strbuf.c |   13 +++++--------
 1 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/strbuf.c b/strbuf.c
index 6ed0684..bfbd816 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -139,14 +139,11 @@ void strbuf_list_free(struct strbuf **sbs)
 
 int strbuf_cmp(const struct strbuf *a, const struct strbuf *b)
 {
-	int cmp;
-	if (a->len < b->len) {
-		cmp = memcmp(a->buf, b->buf, a->len);
-		return cmp ? cmp : -1;
-	} else {
-		cmp = memcmp(a->buf, b->buf, b->len);
-		return cmp ? cmp : a->len != b->len;
-	}
+	int len = a->len < b->len ? a->len: b->len;
+	int cmp = memcmp(a->buf, b->buf, len);
+	if (cmp)
+		return cmp;
+	return a->len < b->len ? -1: a->len != b->len;
 }
 
 void strbuf_splice(struct strbuf *sb, size_t pos, size_t len,
-- 
1.6.2.1.237.g7206c6


* Re: Git Large Object Support Proposal
From: Junio C Hamano @ 2009-03-19 22:31 UTC (permalink / raw)
  To: Scott Chacon; +Cc: git list
In-Reply-To: <d411cc4a0903191514n1e524ebava5895d708a2927c4@mail.gmail.com>

Scott Chacon <schacon@gmail.com> writes:

> But where Git instead stores a stub object and the large binary object
> is pulled in via a separate mechanism. I was thinking that the client
> could set a max file size and when a binary object larger than that is
> staged, Git instead writes a stub blob like:
>
> ==
> blob [size]\0
> [sha of large blob]
> ==

An immediate pair of questions are, if you can solve the issue by
delegating large media to somebody else (i.e. "media server"), and that
somebody else can solve the issues you are having, (1) what happens if you
lower that "large" threshold to "0 bytes"?  Does that somebody else still
work fine, and does the git that uses indirection also still work fine?
If so why are you using git instead of that somebody else altogether?  and
(2) what prevents us from stealing the trick that somebody else uses so
that git itself can natively handle large blobs without indirection?

Without having thought the ramifications through myself, this sounds pretty
much like a band-aid that will end up hitting the same "blob is larger than
we can handle" issue when you eventually follow the indirection, but that is
just my gut feeling.

This is an off-topic "By the way", but has the other topic addressed to you,
on git-scm.com/about, been resolved in any way yet?


* [PATCH] Produce a nicer output in case of sha1_object_info failures in ls-tree -l
From: Alex Riesen @ 2009-03-19 22:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jakub Narebski
In-Reply-To: <7vmybh6u15.fsf@gitster.siamese.dyndns.org>

An error message is already printed by sha1_object_info itself, and
the failed entries are additionally marked in the listing.

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
---

Junio C Hamano, Thu, Mar 19, 2009 23:13:10 +0100:
> Alex Riesen <raa.lkml@gmail.com> writes:
> 
> > Junio C Hamano, Thu, Mar 19, 2009 22:55:56 +0100:
> >> Alex Riesen <raa.lkml@gmail.com> writes:
> >> > @@ -91,6 +90,7 @@ static int show_tree(const unsigned char *sha1, const char *base, int baselen,
> >> >  	if (!(ls_options & LS_NAME_ONLY)) {
> >> >  		if (ls_options & LS_SHOW_SIZE) {
> >> >  			if (!strcmp(type, blob_type)) {
> >> > +				unsigned long size = 0;
> >> >  				sha1_object_info(sha1, &size);
> >> >  				printf("%06o %s %s %7lu\t", mode, type,
> >> >  				       abbrev ? find_unique_abbrev(sha1, abbrev)
> >> 
> >> Hmm, shouldn't you be checking the return value from sha1_object_info()
> >> and skipping the printf() altogether instead?
> >
> > But then I cannot know the name of the failed tree entry.
> 
> Why?
> 
> 	if (sha1_object_info() == OBJ_BAD)
> 		die("object recorded at tree entry %s is bad", pathname);
> 	printf ...

Tried. Makes exactly this code much uglier, and the pathname is
printed nicely quoted after the outer if() is closed. And I don't like
the idea of dying here: it'll take longer to collect all the needed
entry names for later recovery (that's how it came to the change,
AFAIR).

How about this patch instead? I chose "BAD" for the marker, as any
automatic processing trying blindly to convert it into a number will
get a 0, which seems safe to me.

 builtin-ls-tree.c |   22 ++++++++++++----------
 1 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/builtin-ls-tree.c b/builtin-ls-tree.c
index fca4631..22008df 100644
--- a/builtin-ls-tree.c
+++ b/builtin-ls-tree.c
@@ -60,7 +60,6 @@ static int show_tree(const unsigned char *sha1, const char *base, int baselen,
 {
 	int retval = 0;
 	const char *type = blob_type;
-	unsigned long size;
 
 	if (S_ISGITLINK(mode)) {
 		/*
@@ -90,17 +89,20 @@ static int show_tree(const unsigned char *sha1, const char *base, int baselen,
 
 	if (!(ls_options & LS_NAME_ONLY)) {
 		if (ls_options & LS_SHOW_SIZE) {
+			char size_text[24];
 			if (!strcmp(type, blob_type)) {
-				sha1_object_info(sha1, &size);
-				printf("%06o %s %s %7lu\t", mode, type,
-				       abbrev ? find_unique_abbrev(sha1, abbrev)
-				              : sha1_to_hex(sha1),
-				       size);
+				unsigned long size;
+				if (sha1_object_info(sha1, &size) == OBJ_BAD)
+					strcpy(size_text, "BAD");
+				else
+					snprintf(size_text, sizeof(size_text),
+						 "%lu", size);
 			} else
-				printf("%06o %s %s %7c\t", mode, type,
-				       abbrev ? find_unique_abbrev(sha1, abbrev)
-				              : sha1_to_hex(sha1),
-				       '-');
+				strcpy(size_text, "-");
+			printf("%06o %s %s %7s\t", mode, type,
+			       abbrev ? find_unique_abbrev(sha1, abbrev)
+				      : sha1_to_hex(sha1),
+			       size_text);
 		} else
 			printf("%06o %s %s\t", mode, type,
 			       abbrev ? find_unique_abbrev(sha1, abbrev)
-- 
1.6.2.1.237.g7206c6


* Re: [PATCH 00/11] Test on Windows - prequel
From: Junio C Hamano @ 2009-03-19 23:00 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: git
In-Reply-To: <200903192158.46680.j6t@kdbg.org>

Johannes Sixt <j6t@kdbg.org> writes:

> On Mittwoch, 18. März 2009, Johannes Sixt wrote:
>> I'm preparing a series of patches that adjust the test suite so that it
>> passes on Windows (MinGW port). This is the initial part of it. Another
>> dozen or more are to follow. By splitting the series I hope to get
>> earlier feedback.
>>
>> The series is also available from
>>
>>  git://repo.or.cz/git/mingw/j6t.git for-junio
>>
> I've updated the series. Would you please pick it up from the URL
> above?

I think you have a typo in the "Use 'say'" one ("In on case").  Here is a
diff from what I queued previously but didn't have a chance to merge to 'pu':

     test suite: Use 'say' to say something instead of 'test_expect_success'

-    Some test scripts report that some tests will be skipped.  They used
+    Some tests report that some tests will be skipped.  They used
     'test_expect_success' with a trivially successful test.  Nowadays we have
     the helper function 'say' for this purpose.

-    t9700-perl-git.sh was using 'say_color' for this kind of reporting; change
-    it to a vanilla 'say' for consistency.
+    In on case, 'say_color skip' is replaced by 'say' because the former is
+    not intended as a public API.

     Signed-off-by: Johannes Sixt <j6t@kdbg.org>
-    Signed-off-by: Junio C Hamano <gitster@pobox.com>

Other than that, the interdiff matches what I expected to see.

Thanks.


* Re: git-svn and incorrect working copy file timestamps?
From: Guido Ostkamp @ 2009-03-19 23:02 UTC (permalink / raw)
  To: git, derek.mahar

> I learned from http://marc.info/?l=git&m=122783905206964&w=2 that all 
> Git commands do not preserve file timestamps because Git, by design, 
> does not record timestamps in the tree objects.  So, in order to see the 
> last time a particular file changed, you must examine the commit log. 
> I guess I'll just have to get used to ignoring the working copy file 
> timestamps.

As far as I know setting the current time is required when switching 
between different named branches in the same repository.

It can happen that a branch switch ('checkout' in Git's terminology) 
retrieves an older version of a source file, and then the Makefile would 
not detect that an object file (a result from earlier compilation that is 
of course not stored in the repo itself) has to be rebuilt because this is 
based on time checks only. In order to avoid this, the source file (even 
if older) gets the current date, so it is in any case newer than the 
object file and causes an automatic rebuild.

Regards

Guido
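
The make behaviour described above comes down to a modification-time
comparison. A minimal sketch, using made-up file names and a
hypothetical needs_rebuild helper that mimics make's check with
"find -newer":

```shell
#!/bin/sh
# Succeeds if target $2 is missing or source $1 has a newer mtime than it.
needs_rebuild () {
	[ ! -e "$2" ] || [ -n "$(find "$1" -newer "$2")" ]
}

dir=$(mktemp -d) &&
(
	cd "$dir" || exit 1
	touch -t 200903190000 hello.o      # object file built on 2009-03-19
	touch -t 200901010000 hello.c      # pretend the checkout restored an older timestamp
	needs_rebuild hello.c hello.o || echo "make would keep the stale hello.o"
	touch hello.c                      # git writes the file afresh, so it gets the current time
	needs_rebuild hello.c hello.o && echo "make would rebuild hello.o"
)
rm -rf "$dir"
```

With restored timestamps the older source looks up to date against the
object file; with the current time it is always seen as newer.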


* Re: [PATCH] Produce a nicer output in case of sha1_object_info failures in ls-tree -l
From: Junio C Hamano @ 2009-03-19 23:08 UTC (permalink / raw)
  To: Alex Riesen; +Cc: git, Jakub Narebski
In-Reply-To: <20090319225429.GC8433@blimp.localdomain>

Alex Riesen <raa.lkml@gmail.com> writes:

> How about this patch instead? I chose "BAD" for the marker, as any
> automatic processing trying blindly to convert it into a number will
> get a 0, which seems safe to me.

Such a broken automatic processing won't mind getting any garbage; the
choice among this patch, your original "say 0 when we do not know" patch,
or unpatched "size is undefined when an entry is corrupt" git wouldn't
make a whit of difference to it.

An automatic processing that does validate its input will notice BAD is
not a number, and can handle such a corrupt entry more sanely, which is
potentially a big plus.
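
As a rough illustration of a consumer that does validate its input, here
is a minimal sketch in C. It assumes the caller has already isolated the
size column of "ls-tree -l" output as a string; parse_size_field() is a
hypothetical helper, not something that exists in git.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of git): parse the size column of
 * "git ls-tree -l" output.  Returns 0 and stores the value in *size
 * on success, or -1 if the field is not a plain decimal number, for
 * example "BAD" for a corrupt entry or "-" for a non-blob.
 */
static int parse_size_field(const char *field, unsigned long *size)
{
	char *end;
	unsigned long value;

	while (*field == ' ')
		field++;	/* the column is right-aligned with spaces */
	if (*field < '0' || *field > '9')
		return -1;	/* rejects "", "BAD", "-" and signs */
	errno = 0;
	value = strtoul(field, &end, 10);
	if (errno == ERANGE || *end != '\0')
		return -1;	/* overflow or trailing garbage */
	*size = value;
	return 0;
}
```

A blind atoi("BAD") yields 0, while the helper above reports the entry as
unparsable so the caller can treat it as corrupt.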

I think this round is a big improvement.

>  builtin-ls-tree.c |   22 ++++++++++++----------
>  1 files changed, 12 insertions(+), 10 deletions(-)
>
> diff --git a/builtin-ls-tree.c b/builtin-ls-tree.c
> index fca4631..22008df 100644
> --- a/builtin-ls-tree.c
> +++ b/builtin-ls-tree.c
> @@ -60,7 +60,6 @@ static int show_tree(const unsigned char *sha1, const char *base, int baselen,
>  {
>  	int retval = 0;
>  	const char *type = blob_type;
> -	unsigned long size;
>  
>  	if (S_ISGITLINK(mode)) {
>  		/*
> @@ -90,17 +89,20 @@ static int show_tree(const unsigned char *sha1, const char *base, int baselen,
>  
>  	if (!(ls_options & LS_NAME_ONLY)) {
>  		if (ls_options & LS_SHOW_SIZE) {
> +			char size_text[24];
>  			if (!strcmp(type, blob_type)) {
> -				sha1_object_info(sha1, &size);
> -				printf("%06o %s %s %7lu\t", mode, type,
> -				       abbrev ? find_unique_abbrev(sha1, abbrev)
> -				              : sha1_to_hex(sha1),
> -				       size);
> +				unsigned long size;
> +				if (sha1_object_info(sha1, &size) == OBJ_BAD)
> +					strcpy(size_text, "BAD");
> +				else
> +					snprintf(size_text, sizeof(size_text),
> +						 "%lu", size);
>  			} else
> -				printf("%06o %s %s %7c\t", mode, type,
> -				       abbrev ? find_unique_abbrev(sha1, abbrev)
> -				              : sha1_to_hex(sha1),
> -				       '-');
> +				strcpy(size_text, "-");
> +			printf("%06o %s %s %7s\t", mode, type,
> +			       abbrev ? find_unique_abbrev(sha1, abbrev)
> +				      : sha1_to_hex(sha1),
> +			       size_text);
>  		} else
>  			printf("%06o %s %s\t", mode, type,
>  			       abbrev ? find_unique_abbrev(sha1, abbrev)
> -- 
> 1.6.2.1.237.g7206c6

^ permalink raw reply

* Re: t5505-remote fails on Windows
From: Johannes Schindelin @ 2009-03-19 23:15 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Johannes Sixt, Jay Soffian, Git Mailing List
In-Reply-To: <20090319200308.GB17028@coredump.intra.peff.net>

Hi,

On Thu, 19 Mar 2009, Jeff King wrote:

> On Thu, Mar 19, 2009 at 04:02:19AM -0700, Junio C Hamano wrote:
> 
> > > Do we really want an API for that?  Calling qsort() directly should be 
> > > obvious enough, no?
> > 
> > I think so.  If it were done like this (notice the lack of double
> > indirection in the cmp_fn signature):
> > 
> >     typedef int string_list_item_cmp_fn(const struct string_list_item *, const struct string_list_item *);
> > 
> >     void sort_string_list_with_fn(struct string_list *list, string_list_item_cmp_fn *);
> > 
> > it would have made more sense, though.
> 
> IIRC, that is actually not valid C according to the standard (that is, 
> even though a void* can be implicitly assigned to any other pointer, a 
> function taking a void* and a function taking another pointer do not 
> necessarily have the same function signature or calling conventions). 
> Which is why cmp_items in string-list.c already does the indirection.

AFAICT the idea was not to pass the function to qsort() directly, but I 
have to agree that I do not see how that should be possible with the 
current interface of qsort();
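
One commonly used workaround is a trampoline: a comparator with exactly 
the signature qsort() expects, which forwards to the typed callback kept 
in a file-scope variable. The following is only a sketch under 
assumptions: the two structs are simplified stand-ins for the ones in 
string-list.h, current_cmp, cmp_trampoline and cmp_reverse are 
hypothetical names, and the static pointer makes the function 
non-reentrant.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins for git's structures; only what the sketch needs. */
struct string_list_item {
	char *string;
	void *util;
};

struct string_list {
	struct string_list_item *items;
	unsigned int nr;
};

typedef int string_list_item_cmp_fn(const struct string_list_item *,
				    const struct string_list_item *);

/* Typed callback for the sort in progress (assumption: single-threaded). */
static string_list_item_cmp_fn *current_cmp;

/* Has the signature qsort() requires and forwards to the typed callback. */
static int cmp_trampoline(const void *a, const void *b)
{
	const struct string_list_item *x = a;
	const struct string_list_item *y = b;

	return current_cmp(x, y);
}

void sort_string_list_with_fn(struct string_list *list,
			      string_list_item_cmp_fn *fn)
{
	current_cmp = fn;
	qsort(list->items, list->nr, sizeof(*list->items), cmp_trampoline);
	current_cmp = NULL;
}

/* Example typed comparator: reverse alphabetical order. */
static int cmp_reverse(const struct string_list_item *a,
		       const struct string_list_item *b)
{
	return strcmp(b->string, a->string);
}
```

A caller would then write sort_string_list_with_fn(&list, cmp_reverse) 
without any double indirection in its own comparator. Only 
cmp_trampoline is ever handed to qsort(), so the function-type mismatch 
described above does not arise.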

Ciao,
Dscho

^ permalink raw reply

* Re: Gnome chose Git
From: Johannes Schindelin @ 2009-03-19 23:17 UTC (permalink / raw)
  To: demerphq; +Cc: Shawn O. Pearce, Jeff King, Git
In-Reply-To: <9b18b3110903191451u56bbee7biac3a1fee4a36b71d@mail.gmail.com>

Hi,

On Thu, 19 Mar 2009, demerphq wrote:

> 2009/3/19 Shawn O. Pearce <spearce@spearce.org>:
> > "Shawn O. Pearce" <spearce@spearce.org> wrote:
> >> demerphq <demerphq@gmail.com> wrote:
> >> > Outside of parsing the reflog directly, (which feels wrong and dirty
> >> > to me), how does one find out the times that a reflog entry was
> >> > created?
> >> >
> >> > The closest thing i could find was git log -g, but that shows the time
> >>
> >>   git reflog -g branch@{now}
> >
> > Arrgh, I of course actually meant
> >
> >    git log -g branch@{now}
> >
> >> the @{now} suffix is the magic to make it show the time.
> 
> Ah! Much nicer! Thanks.
> 
> Is there by any chance any way to set the date format it uses to
> something more suitable for machine processing?

git log --date=$FORMAT -g branch

Hth,
Dscho

^ permalink raw reply

* git-svn with multiple branch directories
From: Guido Ostkamp @ 2009-03-19 23:17 UTC (permalink / raw)
  To: git

Hello,

I am trying to create a git repo that tracks an SVN repo with multiple 
branch directories.

Is there any way to get this done easily?

It seems the 'git svn' command only allows specifying one 'trunk', 
'branches' and 'tags' directory.

The example usecase is the OpenOffice.org repo (it's just a private 
experiment). I got this svn-sync'ed within 4 evening sessions, the SVN 
size is about ~8 GB with ~270000 commits. Unfortunately their structure is

   branches/
   contrib/
   cws/
   dist/
   patches/
   tags/
   trunk/

where 'cws' and 'branches' both hold branches.

I have seen a web-based article telling one should

   git svn clone <URL>/trunk repo.git

first, and then hack the repo.git/.git/config file manually to add entries 
like

   [svn-remote "b1"]
         url = $SVN_REPO_URL/branches/b1
         fetch = :refs/remotes/b1
   [svn-remote "b2"]
         url = $SVN_REPO_URL/branches/b2
         fetch = :refs/remotes/b2
   [svn-remote "c1"]
         url = $SVN_REPO_URL/cws/c1
         fetch = :refs/remotes/c1
   ...

to later use

   git svn fetch <branchname>

for each branch. But even if that worked, there seems to be no easy way to 
detect newly created branches etc. Additionally, I get two entries listed 
in 'git branch' for each, one of which has the extension '@1' (it seems to 
point to the branch point). This doesn't seem to be the case for repos with 
only one branch directory converted the normal way.

Any ideas?

Best regards

Guido

^ permalink raw reply

* Re: Git Large Object Support Proposal
From: Scott Chacon @ 2009-03-19 23:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git list
In-Reply-To: <7veiwt6t6a.fsf@gitster.siamese.dyndns.org>

Hey,

On Thu, Mar 19, 2009 at 3:31 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Scott Chacon <schacon@gmail.com> writes:
>
>> But where Git instead stores a stub object and the large binary object
>> is pulled in via a separate mechanism. I was thinking that the client
>> could set a max file size and when a binary object larger than that is
>> staged, Git instead writes a stub blob like:
>>
>> ==
>> blob [size]\0
>> [sha of large blob]
>> ==
>
> An immediate pair of questions are, if you can solve the issue by
> delegating large media to somebody else (i.e. "media server"), and that
> somebody else can solve the issues you are having, (1) what happens if you
> lower that "large" threshold to "0 byte"?  Does that somebody else still
> work fine, and does the git that uses indirection also still work fine?
> If so why are you using git instead of that somebody else altogether?  and

In theory it would work fine, where all the commits/trees are
transferred over git and all the blobs are basically stored elsewhere,
but I would assume it would be much slower for the end user and so
nobody would do that.  I would imagine users would only use/enable
this at all if they have large media files that they don't want to
have every version of which cloned every time.  I can't imagine that
this would be used at all by more than a small percentage of users,
but when large media does need to be in source code, they will not use
Git (they will use Perforce or SVN), or they will put it in there and
then kill their (or our) servers when upload-pack tries to mmap it
(twice, yes?).  I thought it would be much more efficient for Git to
have the ability to simply mark files that don't make sense to pack up
and be able to keep track of and transfer them via a more appropriate
protocol.

> (2) what prevents us from stealing the trick that somebody else uses so
> that git itself can natively handle large blobs without indirection?
>

Actually, I'm fine with that - phase two of this project, if it made
sense at all, would be to have another set of git transfer commands
that allowed large blobs to be uploaded/downloaded separately,
importantly not passing them in the packfile and keeping them loose,
uncompressed and headerless on disk so they can simply be streamed
when requested.  I am thinking entirely about movies and images that
are already compressed and there is simply no need to load them
entirely into memory.  I simply thought that taking advantage of
services that already did this (scp, sftp, s3) would be quicker than
building another set of transfer protocols into Git.

> Without thinking the ramifications through myself, this sounds pretty much
> like a band-aid and will end up hitting the same "blob is larger than we
> can handle" issue when you follow the indirection eventually, but that is
> just my gut feeling.

The point is that we don't keep this data as 'blob's - we don't try to
compress them or add the header to them, they're too big and already
compressed, it's a waste of time and often outside the memory
tolerance of many systems. We keep only the stub in our db and stream
the large media content directly to and from disk.  If we do a
'checkout' or something that would switch it out, we could store the
data in '.git/media' or the equivalent until it's uploaded elsewhere.
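
To make the stub idea a bit more concrete, here is a rough sketch in C.
Everything in it is an assumption made for illustration: should_stub()
and format_stub() are hypothetical names, and the [size] field is taken
to be the length of the stub's payload (the 40 hex digits), as in an
ordinary object header. The proposal itself does not pin that down.

```c
#include <stdio.h>
#include <string.h>

#define SHA1_HEX_LEN 40

/* Hypothetical policy check: stub out blobs above the client's limit. */
static int should_stub(unsigned long size, unsigned long max_size)
{
	return max_size && size > max_size;	/* max_size 0 disables stubs */
}

/*
 * Write "blob <size>\0<hex sha1>" into buf, where <size> is assumed to
 * be the payload length.  Returns the number of bytes written, or -1
 * if the sha1 string is malformed or buf is too small.
 */
static int format_stub(char *buf, size_t buflen, const char *sha1_hex)
{
	int hdrlen;

	if (strlen(sha1_hex) != SHA1_HEX_LEN)
		return -1;
	hdrlen = snprintf(buf, buflen, "blob %d", SHA1_HEX_LEN);
	if (hdrlen < 0 || (size_t)hdrlen + 1 + SHA1_HEX_LEN > buflen)
		return -1;
	/* snprintf() already stored the NUL separator at buf[hdrlen] */
	memcpy(buf + hdrlen + 1, sha1_hex, SHA1_HEX_LEN);
	return hdrlen + 1 + SHA1_HEX_LEN;
}
```

The large file itself never passes through this path; only the small
stub would be hashed and stored in the object database.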

>
> This is an off-topic "By the way", but has another topic addressed to you
> on git-scm.com/about resolved in any way yet?
>

Thanks for pointing that out, I missed that thread.  I actually just
pushed out some changes over the last few days - I added the Gnome
project since they just announced they're moving to Git, added a link
to the new O'Reilly book that was just released and I pulled in some
validation and other misc changes that had been contributed.

Currently I have to re-gen the Authors data manually, so I do it every
once in a while - I just pushed up new data.  Doing it per release is
a good idea, I'll try to get that in the release script.

^ permalink raw reply

* Re: [PATCH 2/2] Allow http authentication via prompt for http push.
From: Amos King @ 2009-03-19 23:22 UTC (permalink / raw)
  To: Johannes Schindelin, git
In-Reply-To: <alpine.DEB.1.00.0903191755270.6357@intel-tinevez-2-302>

Sorry for the way I responded.  It was not very appropriate of me.  I
do think that if you would take a little tact in your approach that
you would keep developers trying to improve the code they are putting
into git, and trying to contribute more often.

Amos

On Thu, Mar 19, 2009 at 11:59 AM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> Hi,
>
> On Thu, 19 Mar 2009, Amos King wrote:
>
>> There is now a faux remote created in order to
>> be passed to http_init.
>>
>> Signed-off-by: Amos King <amos.l.king@gmail.com>
>> ---
>>  http-push.c |   11 ++++++++++-
>>  1 files changed, 10 insertions(+), 1 deletions(-)
>>
>> diff --git a/http-push.c b/http-push.c
>> index 9ac2664..468d5af 100644
>> --- a/http-push.c
>> +++ b/http-push.c
>> @@ -2195,7 +2195,16 @@ int main(int argc, char **argv)
>>
>>       memset(remote_dir_exists, -1, 256);
>>
>> -     http_init(NULL);
>> +     /*
>> +      * This is a faked remote so that http_init can
>> +      * get the correct data for builidng out athorization.
>> +      */
>
> You might want to pass this through aspell ;-)  Although it will not
> suggest 'out ->our', I guess...
>
>> +     struct remote *remote;
>> +     remote = xcalloc(sizeof(*remote), 1);
>> +     ALLOC_GROW(remote->url, remote->url_nr + 1, remote->url_alloc);
>> +     remote->url[remote->url_nr++] = repo->url;
>> +
>> +     http_init(remote);
>
> Would 'fake' not be a more appropriate name than 'remote'?
>
> That would also make the patch 1/2 rather unnecessary (I also have to
> admit that I do not find 'repo' a better name, as we have a repository
> both locally and remotely, and this _is_ the remote repository, not the
> local one).
>
> Ciao,
> Dscho
>
>



-- 
Amos King
http://dirtyInformation.com
http://github.com/Adkron
--
Looking for something to do? Visit http://ImThere.com

^ permalink raw reply

* Re: [PATCH] Define a version of lstat(2) with posix semantics
From: Johannes Schindelin @ 2009-03-19 23:30 UTC (permalink / raw)
  To: Alex Riesen
  Cc: Git Mailing List, Johannes Sixt, Jeff King, layer, Junio C Hamano
In-Reply-To: <20090319214001.GA6253@blimp.localdomain>

Hi,

On Thu, 19 Mar 2009, Alex Riesen wrote:

> Johannes Schindelin, Thu, Mar 19, 2009 11:57:01 +0100:
> 
> > I'd rather just disable WIN32_STAT for Cygwin, because otherwise, we 
> > will keep running into issues.
> 
> I'd rather not. The thing is just so unbelievably slow and being stuck 
> on it I'm just trying my damnedest to squeeze every last bit of 
> performance out of it.

If you are serious about performance, you will not stay with Cygwin -- for 
the purposes of Git.

Do not get me wrong: Cygwin is a wonderful thing if your goal is to spare 
yourself a lot of trouble with that seriously challenged win32 API.

But if your goal is to get the most out of the Win32 API in terms of 
speed, you _will_ have to go with MinGW (at least, as long as you are 
unwilling to shell out big bucks in the vague direction of Redmond, and 
add some time tax to that).

Now, we _do_ have msysGit, you _do_ have shown the capability to fix 
issues when they arise, so I do _not_ see any obstacle why you should not 
go msysGit, rather than staying with the pain of trying to stay 
POSIX-compatible, but not quite all the time.

Ciao,
Dscho

^ permalink raw reply

* Re: Google Summer of Code 2009: GIT
From: Johannes Schindelin @ 2009-03-19 23:42 UTC (permalink / raw)
  To: saurabh gupta; +Cc: david, Junio C Hamano, git
In-Reply-To: <ab9fa62a0903191217t5d0e6d9cn4915a425ed8084ff@mail.gmail.com>

Hi,

On Fri, 20 Mar 2009, saurabh gupta wrote:

> On Thu, Mar 19, 2009 at 4:46 AM, Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
>
> > For example, if we decide that OOXML is a must (as it is a proper 
> > standard, and many people will benefit from it), we will most likely 
> > end up in having to write a merge _driver_ (to handle those .zip 
> > files), _and_ a merge _helper_, although we can avoid writing our own 
> > GUI, as we can create an OOXML that has its own version of conflict 
> > markers.
> 
> Well, for ODF type document, we can write a merge driver which will 
> change the xml file in an appropriate way that OO can understand it and 
> the user can see the merge result/conflict in a comfortable way. As 
> described by Junio, in this case, a dedicated merge helper is not needed 
> as OO can parse the markers made by merge-driver and provide the user to 
> resolve the conflict and register the changes to index.

There is also the idea that OOffice has building blocks in place to help 
resolving merge conflicts.  For a successful application, you will have to 
show that you researched that option, and describe how well/badly it fits 
with the goal of the project.

> > - knowing what data types we want to support _at the least_, and what 
> >   data  types we keep for the free skate,
> 
> As of now, how about going for XML files. For this summer, we can go for 
> XML files and latex files can be handled later.

If your goal is just XML files (without any more specific goal, like ODF 
or SVG), I am afraid that I think that project is not worth 4500 dollars 
from Google's pocket.  I mean, we are not talking peanuts here.

> > - a clear picture of the user interface we want to be able to provide,
> 
> In my opinion, we have following things to do:
> 
> => while merging an ODF document, merge-driver will merge the file at
> file level. If changes don't overlap, then it returns the result with
> a success. For example, if the file is changed only on one side, then
> the driver will simply add the new content.
> 
> => If conflicts appear, then the merge driver will put the markers in
> an appropriate manner which the end-user application (e.g. open
> office) can understand and show the user. For example, the XML file of
> that ODF document will be modified and OO can show it  to user in its
> way. We will have to study about the OO style of version marking.
> Another method is to implement the marker style in our own way. For
> example, to show any marker, the XML file is modified so that user can
> see markers like ">>>> " or "====" in openoffice....In this case, we
> will have to just change the xml content in this way.

That is correct, but I would appreciate a bit more definitive research 
_before_ the project proposal, as a sign that you are capable of working 
out the details of the project.

> > - a timeline (weekly milestones should be fine, I guess) what should 
> >   be  achieved when, and
> 
> Timeline can be decided once we reach some conclusion and the work which 
> needs to be done become clear to us.

Last year, most successful applications detailed a proposed timeline in 
their proposal...

Do not get me wrong, I want this project to succeed.

But on the other hand, I feel the obligation to be a bit demanding for the 
gracious donation of Google: we _do_ want to have something stunningly 
awesome at the end of the summer.

And that means that I have to get the impression from the student proposal 
that something like that is at least _possible_.

Ciao,
Dscho

^ permalink raw reply

