Git development
* Re: Git benchmark - comparison with Bazaar, Darcs, Git and Mercurial
From: Linus Torvalds @ 2007-08-01  2:14 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Git Mailing List
In-Reply-To: <200708010216.59750.jnareb@gmail.com>



On Wed, 1 Aug 2007, Jakub Narebski wrote:
> 
> If I remember correctly there were some patches to git which tried to 
> better deal with large blobs. In this simple benchmark git was 
> outperformed by Mercurial and even Bazaar-NG a bit.

It's almost certainly not the binary blobs.

I think almost all the difference is from the cloning, without repacking 
the source or using a local clone.

The default action for a git clone is to create a pack-file, and do a 
local clone as if you did it over the network. That is obviously much 
slower than using the "-l" flag for the _clone_ action, but it tends to be 
better for the end result - since you get a nice packed starting point, 
and none of the confusion with hardlinks etc.

[ Maybe I'm just a worry-wart, but hardlinking two repos still makes me 
  worried. Even though we never modify the object files. 

  Quite frankly, I almost wish we hadn't ever done "-l" at all, and I 
  cannot really suggest using it. Either use "-s" for the truly shared 
  repository, or use the default pack-generating one. The hardlinking one 
  was simple and made sense, but it's really not very nice.

  But that aversion to "git clone -l" is really totally illogical. The way 
  we do the object handling, hardlinking object files in git is just about 
  the most safe operation you can think of - and I *still* shudder at it ]
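
A small sketch of why hardlinked object files are safe to share, as long as 
they are never modified in place. This is not git code, and the file names 
are made up for illustration. It assumes a filesystem that supports 
hardlinks, and it assumes that new content is always written to a temporary 
file and renamed into place rather than written over an existing object.

```python
import os
import tempfile


def hardlink_demo(workdir):
    """Return (same_inode, link_count, original_content_after_replace)."""
    # Hypothetical names standing in for one object file seen from two repos.
    original = os.path.join(workdir, "object-in-repo-a")
    linked = os.path.join(workdir, "object-in-repo-b")

    with open(original, "wb") as f:
        f.write(b"immutable object data")

    # A hardlinking clone does this once per object file.
    os.link(original, linked)
    st_a, st_b = os.stat(original), os.stat(linked)
    same_inode = st_a.st_ino == st_b.st_ino

    # Replace the second name by writing a new file and renaming over it.
    # The first name keeps pointing at the old, untouched inode.
    tmp = linked + ".tmp"
    with open(tmp, "wb") as f:
        f.write(b"something else")
    os.replace(tmp, linked)

    with open(original, "rb") as f:
        return same_inode, st_a.st_nlink, f.read()


with tempfile.TemporaryDirectory() as d:
    print(hardlink_demo(d))
```

The danger with hardlinks only appears when a tool opens one of the names 
and rewrites it in place, because then both repositories see the change.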

Now, I think the "always act as if you were network transparent" by 
default is great, but especially if you have never run "git gc" to 
generate a pack to begin with, it's going to be a very costly thing. And I 
think that's what the numbers show. That's the only op we do a *lot* worse 
on than we should.

(The "nonconflicting merge" is probably - once more - the diffstat 
generation that bites us. That's generally the most costly thing of the 
whole merge, but I *love* the diffstat).

That said, even if he had done a "git gc", to be fair he would have had to 
include the cost of that first garbage collect in the "initial import", so 
the end result would have been exactly the same. Git _does_ end up having 
a very odd performance profile, and while it's optimized for certain 
things, the "initial import" is not one of them.

(Which admittedly is a bit odd. The reason I didn't ever seriously even 
consider monotone was that the initial import was so *incredibly* sucky, 
and took hours for the kernel. So use "-l" for benchmarks, and damn my 
"I hate hardlinking repos" idiocy).

So the only way to truly do a fast initial import *and* get a reasonably 
good initial clone is likely one of:

 - take full advantage of git, and use local branches, instead of 
   bothering with lots of clones.

   I think that this is often the right thing to do, but it's obviously 
   not fair for comparisons, since it's really something different from 
   what's likely available in the other SCM's. But it's the "git way".

 - use "git clone -s" (or "-l").

   I think the hg numbers are the result of hg defaulting to "-l" 
   behaviour.  Which makes sense for hg, since people need to clone more 
   (in git, you'd generally work with local branches instead).

 - or the initial import would be done with some "git fast-import" thing, 
   rather than "git add ." We don't do it now, and the resulting pack-file 
   wouldn't be optimal, but it would be reasonable. It would at least cut 
   down a _bit_ on the clone cost.

The other reaction I took away from that (quite reasonable, I think) 
comparison is that I think Murdock would have been much happier if git 
diff defaulted to "-C". We don't do that (for the best of reasons: 
interoperability), but maybe we should document the "-M/-C" options more. 

The options do show up in the man-page, but apparently not 
obviously enough, since he hadn't noticed.

			Linus


* Re: Git benchmark - comparison with Bazaar, Darcs, Git and Mercurial
From: Shawn O. Pearce @ 2007-08-01  2:17 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git
In-Reply-To: <200708010216.59750.jnareb@gmail.com>

Jakub Narebski <jnareb@gmail.com> wrote:
> I have lately added a new Git speed benchmark, from Bryan Murdock's blog. 
> The repository is a bit atypical:
> 
> <quote>  
>   By performance, I mean that I used the UNIX time command to see how
>   long various basic operations took. Performing the various basic
>   operations gave me some insight into the usability of each as well.
>   For this test I used a directory with 266 MB of files, 258 KB of which
>   were text files, with the rest being image files. I know, kind of
>   weird to version all those binary files, but that was the project I
>   was interested in testing this out on. Your mileage may vary and all
>   that. Here’s a table summarizing the real times reported by time(1):
> </quote>
> 
> If I remember correctly there were some patches to git which tried to 
> better deal with large blobs. In this simple benchmark git was 
> outperformed by Mercurial and even Bazaar-NG a bit.

Yes.  And we backed them out more recently.  :-(

A while ago someone had issues with large binary blobs being added to
the repository as loose objects (e.g. by git-add/git-update-index).
Repacking that repository (for just git-gc or for transport/clone)
was ugly as the large binary blob had to be deflated then
reinflated to encode it in the packfile.  The solution was the
core.legacyheaders = false configuration setting, which used
packfile encoding for loose objects, thereby allowing the packer
to just copy the already compressed data into the output packfile.

Unfortunately we backed that out recently to "simplify the code".
We can still read that loose object format, but we cannot create
it and during packing we don't copy the data (we deflate/inflate
anyway).  So we're back to the horrible deflate/inflate problem.
That probably explains the large clone time seen by the author.
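
A rough sketch of the inflate/deflate round trip described above, using
Python's zlib rather than git's own code. It assumes the legacy loose
format (a "blob <size>\0" text header deflated together with the content)
and that a pack entry wants the content deflated on its own; the function
names are made up for illustration.

```python
import zlib


def make_loose_object(data):
    """Legacy loose blob: the text header and the content in one zlib stream."""
    header = b"blob %d\0" % len(data)
    return zlib.compress(header + data)


def recompress_for_pack(loose_bytes):
    """Inflate a loose object, drop the text header, deflate the content again."""
    raw = zlib.decompress(loose_bytes)            # full inflate of the blob
    header, _, content = raw.partition(b"\0")     # header ends at the first NUL
    kind, size = header.split(b" ")
    assert kind == b"blob" and int(size) == len(content)
    return zlib.compress(content)                 # second full deflate


loose = make_loose_object(b"pretend this is a large image" * 1000)
packed_payload = recompress_for_pack(loose)
print(len(loose), len(packed_payload))
```

For a repository that is mostly large binary files, every byte goes through
both steps once per clone, which is why being able to copy the already
compressed stream mattered.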

I wonder if hg realizes that the two repositories are on the
same filesystem and automatically uses hardlinks if possible (aka
git clone -l).  That would easily explain how they can clone so
dang fast.  Maybe we should do the same in git-clone, it's a pretty
simple thing to do.


I do have to question the author's timing method.  I don't know if
this was hot-cache or not, and he doesn't say.  I don't know if the
system was 100% idle when running these times, or the times were
averaged over a few runs.  Usually the first run of anything can
give inaccurate timings, as for example the executable code may
not be paged in from disk.  One of the tools may have had a bias
as maybe he poked around with that tool first, before starting the
timings, so its executables were still hot in cache.  Etc.
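
A minimal sketch of a timing harness that addresses some of these concerns:
it discards warm-up runs and reports the median of several timed runs. The
function name and defaults are assumptions for illustration, and it does
nothing about cold caches or a busy system.

```python
import statistics
import time


def time_operation(operation, runs=5, warmups=1):
    """Run `operation` repeatedly; return (median, timings) in seconds."""
    for _ in range(warmups):
        operation()                      # discarded: may include paging-in costs
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings), timings


median, timings = time_operation(lambda: sum(range(100000)))
print("median %.6fs over %d runs" % (median, len(timings)))
```

To time a real command, `operation` could wrap a `subprocess.run([...])`
call for each tool under test, run in the same order on the same machine.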

However assuming everything was actually done in a way that the
timings can be accurately relied upon...

Regarding the initial file import it looks like we about broke even
with bzr if you add the "initial file import" and "initial commit"
times together.  Remember we have to hash and compress the data
during git-add; bzr probably delayed their equivalent operation(s)
until the commit operation.  Summing these two times is probably
needed to really compare them.

We were also rather close to hg if you again sum the times up.
But we do appear to be slower, by about 27s.  I guess I find that
hard to believe, but sure, maybe hg somehow has a faster codepath
for their file revision disk IO than we do.  Maybe it's because hg
is streaming data and we're loading it all in-core first; maybe the
author's system had to swap to get enough virtual memory for git-add.
Maybe it is just because the author's testing methodology was not
very good and one or more of these numbers are just bunk.


Our merge time is pretty respectable given the competition.
It's probably within the margin of error of the author's testing
methodology.
 
-- 
Shawn.


* Re: dangling blob which is not dangling at all
From: Linus Torvalds @ 2007-08-01  2:22 UTC (permalink / raw)
  To: Domenico Andreoli; +Cc: git
In-Reply-To: <20070801013450.GA16498@raptus.dandreoli.com>



On Wed, 1 Aug 2007, Domenico Andreoli wrote:
> 
> $ git fsck --no-reflogs
> dangling blob e5d444e61b834c34710ce8fb5cb176e20e5894e1
>
> $ git-ls-tree 70b58535361eb633d44d4f1275af3421ca6a5ed7
> ...
> 100644 blob e5d444e61b834c34710ce8fb5cb176e20e5894e1    link_stream.c

Have you done clones with stupid protocols (rsync and/or http)?

The simplest explanation for this is that since you didn't do "--full" for 
fsck, then your git-fsck never looked into the pack-files you had. And the 
tree might well exist in a pack-file, and thus not even looked at by fsck.

So try "git fsck --full", and see if that changes the picture.

(Usually, you'd never have a pack-file *and* the loose object it points to 
both at the same time, but especially if you use the dumb transports 
(rsync and/or http), you'll get pack-files from remotes, and thus you 
won't have the normal nice behaviour of pack-files being "old state", and 
loose objects being "new state".)

The easiest fixup is likely to just do "git gc", which will do a nice 
repack, and get rid of loose objects that are duplicates of stuff 
that is also in a pack-file.

		Linus


* Re: Git clone error
From: Linus Torvalds @ 2007-08-01  2:38 UTC (permalink / raw)
  To: Denis Bueno; +Cc: Git Mailing List
In-Reply-To: <C2D525CB.2A81%denbuen@sandia.gov>



On Tue, 31 Jul 2007, Denis Bueno wrote:
> 
> I'm a new git user (actually a darcs convert) and am running into a weird
> problem on a small and simple repository.  The error I see is:
> 
>     git[14] > git clone /Volumes/work/scripts/
>     Initialized empty Git repository in /tmp/git/scripts/.git/
>     remote: Generating pack...
>     Done counting 80 objects.
>     remote: error: unable to unpack b28b949a1a3c8eb37ca6eefd024508fa8b253429 header
>     fatal: unable to get type of object b28b949a1a3c8eb37ca6error: git-upload-pack: git-pack-objects died with error.
>     fatal: git-upload-pack: aborting due to possible repository corruption on the remote side.
>     remote: aborting due to possible repository corruption on the remote side.
>     fatal: early EOF
>     fatal: index-pack died with error code 128
>     fetch-pack from '/Volumes/work/scripts/.git' failed.

Well, it says so, but the most likely issue really is that there is 
"corruption on the remote side".

Please do

	cd /Volumes/work/scripts
	git fsck --full

which I would guess will almost certainly talk about some kind of problems 
with that object "b28b949a1a3c8eb37ca6eefd024508fa8b253429". And possibly 
more.

The "unable to unpack .. header" problem would at a guess be a totally 
corrupted loose object. You should have a file named

	.git/objects/b2/8b949a1a3c8eb37ca6eefd024508fa8b253429

and it sounds like that file is corrupt. So far, apart from a CRLF 
conversion bug (that you wouldn't have triggered on OS X anyway), I think 
every single time we've seen that, it's been a real disk or memory 
corruption.

Trying to restore corrupt objects can be quite hard, since git stores them 
in a compressed format, and so single-bit errors are very detectable 
(that's kind of the point of having cryptographically secure hashes!), but 
they are very much not fixable, unless you can find the original data that 
*resulted* in that object some way (in another clone of the git 
repository, or in a backup).

There are certainly ways to figure out what that object _should_ be, 
though. For example, if that object is the only corrupted entry, and it's 
a blob (ie pure data object), what you can generally do is still use the 
repo (as long as you avoid that particular blob), and do things like

	git log --raw -r --no-abbrev

and you'll see the git history with all blobs named with their SHA1's. 
Then you can just search for that b28b949.. name, and you'll see exactly 
which file (in which version) it was, and if you can just find a backup of 
that file (or re-create it *exactly* from previous versions and your 
memory of it), you can re-generate the git object and thus save your 
repository.
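
A small sketch of how a recovered file could be checked against the missing 
object before it is put back. It assumes the standard blob naming (SHA-1 
over a "blob <size>\0" header plus the content) and the loose object layout 
shown above; the helper names are made up, and it ignores any content 
conversion that might be configured.

```python
import hashlib


def blob_id(content):
    """SHA-1 name of a blob holding exactly these bytes."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def loose_object_path(object_id):
    """Path of a loose object, relative to the .git directory."""
    return "objects/%s/%s" % (object_id[:2], object_id[2:])


def matches_missing_object(candidate, wanted_id):
    """True if this candidate content would regenerate the wanted object."""
    return blob_id(candidate) == wanted_id


wanted = "b28b949a1a3c8eb37ca6eefd024508fa8b253429"
print(loose_object_path(wanted))
print(matches_missing_object(b"some recovered file content\n", wanted))
```

Only a byte-for-byte identical file produces the same name, which is why 
the re-creation has to be *exact*.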

Of course, a much easier option tends to be to just have a backup 
repository that has that object in it, and then you can literally just 
copy that b28b949 object over.

In short: git basically guarantees that it will *find* all corruption, but 
it doesn't do backups for you. Backups are easy to do (cloning), and git 
also makes incremental backups trivial (ie just do "git fetch" or "git 
push"), but git won't do backups for you unless you ask for them that way.

			Linus


* Re: [PATCH] Mention libiconv as a requirement for git-am
From: Han-Wen Nienhuys @ 2007-08-01  3:03 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy; +Cc: git, Johannes Sixt
In-Reply-To: <20070731150948.GA9947@localhost>

libiconv is already in there, version 1.11,

[lilydev@haring gub]$ grep bin.libiconv
target/mingw/gubfiles/installer-git-master-repo.or.cz-git-mingw.git/files.txt
\usr\bin\libiconv-2.dll

I don't understand your request: do you want to have some flags added
to the environment?

2007/7/31, Nguyen Thai Ngoc Duy <pclouds@gmail.com>:
> ---
>  Han-Wen, any chance to include libiconv to the installer? You may need
>  to set NEEDS_ICONV, ICONVDIR and NO_ICONV properly to make git-am work.
>
>  README.MinGW |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/README.MinGW b/README.MinGW
> index 89b7065..c0b8f66 100644
> --- a/README.MinGW
> +++ b/README.MinGW
> @@ -28,6 +28,7 @@ In order to compile this code you need:
>         zlib-1.2.3-mingwPORT-1.tar
>         w32api-3.6.tar.gz
>         tcltk-8.4.1-1.exe (for gitk, git-gui)
> +       libiconv-1.9.2-1-{lib,bin}.zip (for git-am, from http://gnuwin32.sourceforge.net/packages/libiconv.htm)
>
>
>  STATUS
> --
> 1.5.0.7
>


-- 
Han-Wen Nienhuys - hanwen@xs4all.nl - http://www.xs4all.nl/~hanwen


* Re: [PATCH] Mention libiconv as a requirement for git-am
From: Nguyen Thai Ngoc Duy @ 2007-08-01  3:18 UTC (permalink / raw)
  To: hanwen; +Cc: git, Johannes Sixt
In-Reply-To: <f329bf540707312003i60e910e9kf97d2f50fdecbed2@mail.gmail.com>

I'm sorry, I judged by reading the Makefile without actually testing your
installation, so I might be wrong. By default the Makefile sets
NO_ICONV=YesPlease on MinGW, so it won't use libiconv even if it
exists. You need to unset NO_ICONV and set NEEDS_LIBICONV in
config.mak in order to enable it.

I'm going to test it tomorrow.

On 7/31/07, Han-Wen Nienhuys <hanwenn@gmail.com> wrote:
> libiconv is already in there, version 1.11,
>
> [lilydev@haring gub]$ grep bin.libiconv
> target/mingw/gubfiles/installer-git-master-repo.or.cz-git-mingw.git/files.txt
> \usr\bin\libiconv-2.dll
>
> I don't understand your request: do you want to have some flags added
> to the environment?
>
> 2007/7/31, Nguyen Thai Ngoc Duy <pclouds@gmail.com>:
> > ---
> >  Han-Wen, any chance to include libiconv to the installer? You may need
> >  to set NEEDS_ICONV, ICONVDIR and NO_ICONV properly to make git-am work.
> >
> >  README.MinGW |    1 +
> >  1 files changed, 1 insertions(+), 0 deletions(-)
> >
> > diff --git a/README.MinGW b/README.MinGW
> > index 89b7065..c0b8f66 100644
> > --- a/README.MinGW
> > +++ b/README.MinGW
> > @@ -28,6 +28,7 @@ In order to compile this code you need:
> >         zlib-1.2.3-mingwPORT-1.tar
> >         w32api-3.6.tar.gz
> >         tcltk-8.4.1-1.exe (for gitk, git-gui)
> > +       libiconv-1.9.2-1-{lib,bin}.zip (for git-am, from http://gnuwin32.sourceforge.net/packages/libiconv.htm)
> >
> >
> >  STATUS
> > --
> > 1.5.0.7
> >
>
>
> --
> Han-Wen Nienhuys - hanwen@xs4all.nl - http://www.xs4all.nl/~hanwen
>


-- 
Duy


* Re: [ANNOUNCE] GIT MinGW port updated to v1.5.3.rc2
From: Han-Wen Nienhuys @ 2007-08-01  3:26 UTC (permalink / raw)
  To: git; +Cc: git
In-Reply-To: <46A87F5D.C5BFC04B@eudaptics.com>

Johannes Sixt wrote:
> I've just pushed an update of the MinGW port to:
> 
> clone:	git://repo.or.cz/git/mingw.git
> gitweb:	http://repo.or.cz/w/git/mingw.git
> 
> It is now at v1.5.3.rc2.
> 
> NOTE! This is only lightly tested, i.e. it passes (most of) the test
> suite(*), but it hasn't been used in production, yet.
> 

to match this beacon of robustness: I have put up 
a completely untested binary build for mingw at 

  http://lilypond.org/git/binaries/mingw/

-- 
 Han-Wen Nienhuys - hanwen@xs4all.nl - http://www.xs4all.nl/~hanwen


* Re: [PATCH] Mention libiconv as a requirement for git-am
From: Han-Wen Nienhuys @ 2007-08-01  3:36 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy; +Cc: git, Johannes Sixt
In-Reply-To: <fcaeb9bf0707312018p25297d76g50489fa303856dd6@mail.gmail.com>

2007/8/1, Nguyen Thai Ngoc Duy <pclouds@gmail.com>:
> I'm sorry, I judged by reading the Makefile without actually testing your
> installation, so I might be wrong. By default the Makefile sets
> NO_ICONV=YesPlease on MinGW, so it won't use libiconv even if it
> exists. You need to unset NO_ICONV and set NEEDS_LIBICONV in
> config.mak in order to enable it.

this is done automatically when running configure. If you find a
problem with the installer please let me know.

> I'm going to test it tomorrow.



-- 
Han-Wen Nienhuys - hanwen@xs4all.nl - http://www.xs4all.nl/~hanwen


* Re: [PATCH 1/4] Fix allocation of "int*" instead of "int".
From: Christian Couder @ 2007-08-01  4:11 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio Hamano, git
In-Reply-To: <20070731195616.GA6329@sigill.intra.peff.net>

On Tuesday 31 July 2007 21:56, Jeff King wrote:
> On Tue, Jul 31, 2007 at 02:48:29PM +0200, Christian Couder wrote:
> > -	weights = xcalloc(on_list, sizeof(int*));
> > +	weights = xcalloc(on_list, sizeof(int));
>
> How about the correct-by-definition sizeof(*weights)?

That's ok for me too.

Thanks,
Christian.


* Re: [PATCH 2/4] Add functions get_relative_cwd() and is_inside_dir()
From: Junio C Hamano @ 2007-08-01  4:22 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: gitster, git, matled
In-Reply-To: <Pine.LNX.4.64.0708010129090.14781@racer.site>

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> The function get_relative_cwd() works just as getcwd(), only that it
> takes an absolute path as additional parameter, returning the prefix
> of the current working directory relative to the given path.  If the
> cwd is no subdirectory of the given path, it returns NULL.
> ...
> +/*
> + * get_relative_cwd() gets the prefix of the current working directory
> + * relative to 'dir'.  If we are not inside 'dir', it returns NULL.
> + * As a convenience, it also returns NULL if 'dir' is already NULL.
> + */
> +char *get_relative_cwd(char *buffer, int size, const char *dir)
> +{
> +	char *cwd = buffer;
> +
> +	if (!dir || !getcwd(buffer, size))
> +		return NULL;

When is it not a fatal error if get_relative_cwd() is called
with a NULL dir parameter, or getcwd() fails?

If there are no such valid cases, I would rather have this
die(), the former with "BUG" and the latter with strerror(errno).
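
For reference, the behaviour the quoted comment describes can be sketched
as follows. This is a Python illustration of the stated semantics, not the
C code from the patch; the function name is made up, and returning an empty
string when the cwd equals 'dir' is an assumption.

```python
def relative_cwd(cwd, directory):
    """Return cwd relative to `directory`, or None if cwd is not inside it."""
    if directory is None:
        return None                        # mirrors the NULL-in, NULL-out case
    directory = directory.rstrip("/") or "/"
    if cwd == directory:
        return ""                          # assumption: dir itself counts as inside
    prefix = directory if directory.endswith("/") else directory + "/"
    if cwd.startswith(prefix):
        return cwd[len(prefix):]
    return None                            # e.g. /repository is not inside /repo


print(relative_cwd("/repo/src/lib", "/repo"))
print(relative_cwd("/repository", "/repo"))
```

The sketch keeps the silent None for a missing 'dir'; turning that case
into a hard failure would be a one-line change at the top.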


* [PATCH] make the name of the library directory a config option
From: Robert Schiele @ 2007-08-01  4:30 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano

Introduce new makefile variable lib to hold the name of the lib
directory ("lib" by default).  Also introduce a switch for configure
to specify this name with --with-lib=ARG.  This is useful for systems
that use a different name than "lib" (like "lib64" on some 64 bit
Linux architectures).

Signed-off-by: Robert Schiele <rschiele@gmail.com>
---
You can also fetch this patch as e58a07b30a879228b9090a0c8ac6c690d77fcde8 from
git://schiele.dyndns.org/git

It requires that my zlib patch was applied before.

 Makefile     |   11 ++++++-----
 configure.ac |   11 +++++++++++
 2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/Makefile b/Makefile
index ca1247d..ff5fc5f 100644
--- a/Makefile
+++ b/Makefile
@@ -151,6 +151,7 @@ sysconfdir = /etc
 else
 sysconfdir = $(prefix)/etc
 endif
+lib = lib
 ETC_GITCONFIG = $(sysconfdir)/gitconfig
 # DESTDIR=
 
@@ -500,9 +501,9 @@ endif
 
 ifndef NO_CURL
 	ifdef CURLDIR
-		# Try "-Wl,-rpath=$(CURLDIR)/lib" in such a case.
+		# Try "-Wl,-rpath=$(CURLDIR)/$(lib)" in such a case.
 		BASIC_CFLAGS += -I$(CURLDIR)/include
-		CURL_LIBCURL = -L$(CURLDIR)/lib $(CC_LD_DYNPATH)$(CURLDIR)/lib -lcurl
+		CURL_LIBCURL = -L$(CURLDIR)/$(lib) $(CC_LD_DYNPATH)$(CURLDIR)/$(lib) -lcurl
 	else
 		CURL_LIBCURL = -lcurl
 	endif
@@ -520,7 +521,7 @@ endif
 
 ifdef ZLIB_PATH
 	BASIC_CFLAGS += -I$(ZLIB_PATH)/include
-	EXTLIBS += -L$(ZLIB_PATH)/lib $(CC_LD_DYNPATH)$(ZLIB_PATH)/lib
+	EXTLIBS += -L$(ZLIB_PATH)/$(lib) $(CC_LD_DYNPATH)$(ZLIB_PATH)/$(lib)
 endif
 EXTLIBS += -lz
 
@@ -528,7 +529,7 @@ ifndef NO_OPENSSL
 	OPENSSL_LIBSSL = -lssl
 	ifdef OPENSSLDIR
 		BASIC_CFLAGS += -I$(OPENSSLDIR)/include
-		OPENSSL_LINK = -L$(OPENSSLDIR)/lib $(CC_LD_DYNPATH)$(OPENSSLDIR)/lib
+		OPENSSL_LINK = -L$(OPENSSLDIR)/$(lib) $(CC_LD_DYNPATH)$(OPENSSLDIR)/$(lib)
 	else
 		OPENSSL_LINK =
 	endif
@@ -545,7 +546,7 @@ endif
 ifdef NEEDS_LIBICONV
 	ifdef ICONVDIR
 		BASIC_CFLAGS += -I$(ICONVDIR)/include
-		ICONV_LINK = -L$(ICONVDIR)/lib $(CC_LD_DYNPATH)$(ICONVDIR)/lib
+		ICONV_LINK = -L$(ICONVDIR)/$(lib) $(CC_LD_DYNPATH)$(ICONVDIR)/$(lib)
 	else
 		ICONV_LINK =
 	endif
diff --git a/configure.ac b/configure.ac
index b2f1965..84fd7f1 100644
--- a/configure.ac
+++ b/configure.ac
@@ -69,6 +69,17 @@ fi \
 ## Site configuration related to programs (before tests)
 ## --with-PACKAGE[=ARG] and --without-PACKAGE
 #
+# Set lib to alternative name of lib directory (e.g. lib64)
+AC_ARG_WITH([lib],
+ [AS_HELP_STRING([--with-lib=ARG],
+                 [ARG specifies alternative name for lib directory])],
+ [if test "$withval" = "no" -o "$withval" = "yes"; then \
+	AC_MSG_WARN([You should provide name for --with-lib=ARG]); \
+else \
+	GIT_CONF_APPEND_LINE(lib=$withval); \
+fi; \
+],[])
+#
 # Define SHELL_PATH to provide path to shell.
 GIT_ARG_SET_PATH(shell)
 #
-- 
1.5.2.3


* Re: MinGW build environment for Git
From: Marius Storm-Olsen @ 2007-08-01  5:15 UTC (permalink / raw)
  To: Dmitry Kakurin; +Cc: git
In-Reply-To: <a1bbc6950707291614w392bf3a9t5d0d9e50bfcb0f36@mail.gmail.com>


Dmitry Kakurin said the following on 30.07.2007 01:14:
> I want to be able to build MinGW port of Git on Windows. I've tried
> to follow steps in README.MinGW to setup this environment myself
> (install MinGW, MSys, ZLib etc.) but after wasting a lot of time
> with no result I give up. So, could somebody please just pkzip
> their environment (everything required) and share the zip file with
> me (privately or publicly)? I also think that an even better idea
> is to create a new Git repository with MinGW build environment.
> This will make contributing to MinGW port of Git MUCH easier.

Hi Dmitry,

Aaron has done this, and you can find the link on his blog, here:
     http://www.ekips.org/cgi-bin/aaron.cgi/2007/02/27

The archive's compressed with 7zip, so you'll need that to decompress 
it. (http://www.7-zip.org/)

Direct link to the archive:
     http://inkscape.modevia.com/git/mingw4git.7z

Good luck!
-- 
.marius




* Re: [PATCH 4/4] Clean up work-tree handling
From: Junio C Hamano @ 2007-08-01  5:17 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, matled
In-Reply-To: <Pine.LNX.4.64.0708010129530.14781@racer.site>

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> The old version of work-tree support was an unholy mess, barely readable,
> and not to the point.
>
> For example, why do you have to provide a worktree, when it is not used?
> As in "git status".  Now it works.
> ...

Without continuing with negatives, let's try to define the new,
corrected world order.

I do not think the following is exactly what your cleaned-up
version tries to perform, but I am writing this down primarily
to demonstrate the style and the level of detail I expect to
accompany a clean-up patch like this.

----------------------------------------------------------------

Definitions:

 - You can have "checked out" files on the filesystem, and such
   files are said to be in your "work tree".  The directory
   on the filesystem that corresponds to the toplevel entries of
   the index and the tree objects directly contained in the
   commit objects is called "the toplevel of the work tree",
   or simply "work tree" if it is not ambiguous from the
   context.

 - The directory that holds git repository information is called
   "git directory".  This is typically .git directory at the
   toplevel of your work tree, but not necessarily so.

 - You can perform many git operations without a work tree, but
   some operations fundamentally require you to have one
   (e.g. checkout and diff, unless two tree-ishes are given, do
   not make sense without a work tree).

There are four predicates, two interrogators, and two
manipulators:

 - is_inside_git_dir(): this returns true if the $cwd is the git
   directory or its subdirectory. [IS THIS STILL NEEDED???]

 - is_inside_work_tree(): this returns true if the $cwd is
   inside work tree (i.e. either at the toplevel of the work
   tree or its subdirectory).  [NEEDSHELP: is .git in the usual
   layout considered "is_inside_work_tree()"?  Should it?]

 - is_bare_repository(): this returns true if no work tree is
   found.  There is a corresponding function usable from the
   scripts.

 - require_work_tree (shell): this is called by scripts that
   need to have a work tree to operate, and barfs otherwise.

 - get_git_dir(): this returns the location of the git
   directory.  With GIT_DIR environment variable, or --git-dir
   command line option, you can tell git to use a specific
   directory as the git directory.  Otherwise a directory that
   looks like a git directory and whose name is .git is looked
   for, in the $cwd or its parent directory.  If there is no
   such directory, and if the $cwd looks like a git directory,
   $cwd is the git directory.

 - get_git_work_tree(): this returns the location of the work
   tree; it returns NULL if there is none.  The command line
   option --work-tree, or the environment variable GIT_WORK_TREE
   can specify the location; otherwise, git directory is looked
   for so that its configuration file can be read.  If
   core.worktree is there, that specifies the location.
   Otherwise, if the basename of the git directory is .git, it
   is the parent directory of the git directory.  Otherwise, you
   do not have work tree.

 - set_git_dir(): used to set the git directory, internally to
   handle --git-dir option;

 - set_work_tree(): used to set the work tree, internally to
   handle --work-tree option;
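
The lookup order given for get_git_work_tree() above can be sketched like
this. It is a Python illustration, not the C implementation; the function
name and parameters are stand-ins, and letting the command line option win
over the environment variable is an assumption, since the text does not
rank the two.

```python
import os


def find_work_tree(git_dir, option=None, env=None, config=None):
    """Resolve the work tree location, or return None if there is none.

    `option` stands in for --work-tree, `env` for the process environment,
    and `config` for the parsed configuration found in `git_dir`.
    """
    env = env or {}
    config = config or {}
    if option:
        return option
    if env.get("GIT_WORK_TREE"):
        return env["GIT_WORK_TREE"]
    if config.get("core.worktree"):
        return config["core.worktree"]
    git_dir = git_dir.rstrip("/")
    if os.path.basename(git_dir) == ".git":
        return os.path.dirname(git_dir) or "."
    return None                            # no work tree found


print(find_work_tree("/home/me/project/.git"))
print(find_work_tree("/srv/git/project.git"))
```

Under these rules is_bare_repository() would amount to checking whether
this lookup comes back empty.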

----------------------------------------------------------------

After writing the above down, it strikes me odd that we do not
have a predicate that says "we know the work tree is there".

If a command wants a work tree, and if you are outside the work
tree, then is_inside_work_tree() returns false and
get_git_work_tree() returns non NULL, so that is a good pair of
interfaces that can be mixed and matched (e.g. you can chdir to
the latter to perform the whole tree operation, or refuse to
perform, based on is_inside_work_tree being false, cwd relative
operations).


* Re: [PATCH 2/4] Add functions get_relative_cwd() and is_inside_dir()
From: Junio C Hamano @ 2007-08-01  5:35 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, matled
In-Reply-To: <7vy7gvdgtn.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano <gitster@pobox.com> writes:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
>> The function get_relative_cwd() works just as getcwd(), only that it
>> takes an absolute path as additional parameter, returning the prefix
>> of the current working directory relative to the given path.  If the
>> cwd is no subdirectory of the given path, it returns NULL.
>> ...
>> +/*
>> + * get_relative_cwd() gets the prefix of the current working directory
>> + * relative to 'dir'.  If we are not inside 'dir', it returns NULL.
>> + * As a convenience, it also returns NULL if 'dir' is already NULL.
>> + */
>> +char *get_relative_cwd(char *buffer, int size, const char *dir)
>> +{
>> +	char *cwd = buffer;
>> +
>> +	if (!dir || !getcwd(buffer, size))
>> +		return NULL;
>
> When is it not a fatal error if get_relative_cwd() is called
> with a NULL dir parameter, or getcwd() fails?
>
> If there are no such valid cases, I would rather have this
> die(), the former with "BUG" and the latter with strerror(errno).

Heh, it turns out that there is this lazy or clever (depending
on the viewpoint) caller that passes the return value of
get_git_work_tree() to this function and expects this to return
NULL when no work tree is found.

The callers of the is_* functions are much cleaner and in that
sense the series is a definite improvement, but this one
particular obscurity makes me wonder if it is replacing one
unholy mess with a smaller but still unholy mess.

Will apply on "master" and will be part of -rc4, but we probably
would want to have a longer pre-final freeze than usual to
really make sure this one is good.


* Re: [PATCH] make the name of the library directory a config option
From: Junio C Hamano @ 2007-08-01  5:36 UTC (permalink / raw)
  To: Robert Schiele; +Cc: git
In-Reply-To: <20070801043035.GH29424@schiele.dyndns.org>

The patch is obvious and trivial so I'll swallow, but as a
general rule I'd like to avoid changes to the build procedure
this late in the game.

Thanks for the patch, will apply.


* Re: Git benchmark - comparison with Bazaar, Darcs, Git and Mercurial
From: Junio C Hamano @ 2007-08-01  5:50 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, Git Mailing List
In-Reply-To: <alpine.LFD.0.999.0707311850220.4161@woody.linux-foundation.org>

Linus Torvalds <torvalds@linux-foundation.org> writes:

> (Which admittedly is a bit odd. The reason I didn't ever seriously even 
> consider monotone was that the initial import was so *incredibly* sucky, 
> and took hours for the kernel. So use "-l" for benchmarks, and damn my 
> "I hate hardlinking repos" idiocy).

I would call aversion to -l a superstition, while aversion to -s
has sound technical reasons.  The latter means you need to know
what you are doing --- namely, you are making the clone still
dependent on the original.


* Re: dangling blob which is not dangling at all
From: Domenico Andreoli @ 2007-08-01  6:32 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git
In-Reply-To: <alpine.LFD.0.999.0707311914570.4161@woody.linux-foundation.org>

On Tue, Jul 31, 2007 at 07:22:14PM -0700, Linus Torvalds wrote:
> 
> 
> On Wed, 1 Aug 2007, Domenico Andreoli wrote:
> > 
> > $ git fsck --no-reflogs
> > dangling blob e5d444e61b834c34710ce8fb5cb176e20e5894e1
> >
> > $ git-ls-tree 70b58535361eb633d44d4f1275af3421ca6a5ed7
> > ...
> > 100644 blob e5d444e61b834c34710ce8fb5cb176e20e5894e1    link_stream.c
> 
> Have you done clones with stupid protocols (rsync and/or http)?

I do not remember having used any dumb transport on this repository but
I recall having tried git-repack with the intent of git gc.

> So try "git fsck --full", and see if that changes the picture.

This did not change anything.

> The easiest fixup is likely to just do "git gc", which will do a nice 
> repack, and get rid of loose objects that are duplicates of stuff 
> that is also in a pack-file.

This fixed things and also warned about two heads referring to pruned
commits, which may be those two commits I removed by hand (I hope).

Cheers,
Domenico

-----[ Domenico Andreoli, aka cavok
 --[ http://www.dandreoli.com/gpgkey.asc
   ---[ 3A0F 2F80 F79C 678A 8936  4FEE 0677 9033 A20E BC50

^ permalink raw reply

* Re: dangling blob which is not dangling at all
From: Junio C Hamano @ 2007-08-01  7:27 UTC (permalink / raw)
  To: Domenico Andreoli; +Cc: Linus Torvalds, git
In-Reply-To: <20070801063209.GA13511@raptus.dandreoli.com>

Domenico Andreoli <cavokz@gmail.com> writes:

> This fixed things and also warned about two heads referring to pruned
> commits, which may be those two commits I removed by hand (I hope).

Exactly.

All refs under .git/refs (the special case of this includes the
branch heads in .git/refs/heads) are your _promise_ to git that
everything that is reachable from them is supposed to be
available in your repository.  If you remove specific commits by
hand without adjusting the branch ref, you are breaking that
promise and git-fsck will notice it as a repository breakage.

If you do not need a branch and everything reachable only from
that branch, you can remove that branch (with "git branch -D"),
and run git-gc, which internally does the same reachability
analysis as git-fsck does and gets rid of objects that are no
longer necessary.
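
A sketch of that sequence in a scratch repository: drop the branch ref
first, then let git-gc discard what is no longer reachable. The branch and
file names are invented for the example. The "git reflog expire" step is an
assumption about current git versions, where reflog entries (for example
HEAD's) can still keep the commit alive until they expire.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name Example
git config user.email example@example.com

echo base > file.txt
git add file.txt
git commit -q -m "base"
main_branch=$(git symbolic-ref --short HEAD)

# A side branch with one commit that is reachable only from that branch.
git checkout -q -b side
echo side > side.txt
git add side.txt
git commit -q -m "side work"
side_commit=$(git rev-parse HEAD)
git checkout -q "$main_branch"

# Remove the ref first, so no promise about that commit remains ...
git branch -q -D side
# ... then drop reflog entries that still mention it, and repack and prune.
git reflog expire --expire=now --all
git gc -q --prune=now

# The repository is still consistent; check whether the side commit is gone.
git fsck
if git cat-file -e "$side_commit" 2>/dev/null; then
    side_status=present
else
    side_status=pruned
fi
echo "side commit: $side_status"
```

Deleting the object files by hand while a ref still points at them is the
reverse order, and that is what git-fsck reports as breakage.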

^ permalink raw reply

* Re: dangling blob which is not dangling at all
From: Domenico Andreoli @ 2007-08-01  7:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vhcnjbtpt.fsf@assigned-by-dhcp.cox.net>

On Wed, Aug 01, 2007 at 12:27:10AM -0700, Junio C Hamano wrote:
> Domenico Andreoli <cavokz@gmail.com> writes:
> 
> > This fixed things and also warned about two heads referring to pruned
> > commits, which may be those two commits I removed by hand (I hope).
> 
> Exactly.
> 
> All refs under .git/refs (the special case of this includes the
> branch heads in .git/refs/heads) are your _promise_ to git that
> everything that is reachable from them is supposed to be
> available in your repository.  If you remove specific commits by
> hand without adjusting the branch ref, you are breaking that
> promise and git-fsck will notice it as a repository breakage.

If I move any ref by hand (not that I spend the day doing this..), I
understand that some commits may suddenly become unreachable. But
those commits I removed by hand were already unreachable so no refs
should have been referring to them.

What is this reflog thing and why is it required?

Domenico

-----[ Domenico Andreoli, aka cavok
 --[ http://www.dandreoli.com/gpgkey.asc
   ---[ 3A0F 2F80 F79C 678A 8936  4FEE 0677 9033 A20E BC50

^ permalink raw reply

* Re: Git benchmark - comparison with Bazaar, Darcs, Git and Mercurial
From: Jakub Narebski @ 2007-08-01  8:33 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List
In-Reply-To: <alpine.LFD.0.999.0707311850220.4161@woody.linux-foundation.org>

Linus Torvalds wrote:

> (The "nonconflicting merge" is probably - once more - the diffstat 
> generation that bites us. That's generally the most costly thing of the 
> whole merge, but I *love* the diffstat).

http://bryan-murdock.blogspot.com/2007/03/cutting-edge-revision-control.html
doesn't say what the directory structure of the imported files is.
If it is flat, then git cannot take advantage of its hierarchical tree
structure.

By the way, I guess that "nonconflicting merge" is trivial tree-level
merge, as "no changes" merge should be faster (or fast-forward).

About clone: there was "pack loose, copy existing packs" idea. I don't
remember what happened with it. At least for local clone it would be
nice.

-- 
Jakub Narebski
Poland

^ permalink raw reply

* Re: dangling blob which is not dangling at all
From: Steven Grimm @ 2007-08-01  8:35 UTC (permalink / raw)
  To: Domenico Andreoli; +Cc: Junio C Hamano, git
In-Reply-To: <20070801074237.GA14790@raptus.dandreoli.com>

Domenico Andreoli wrote:
> What is this reflog thing and why is it required?
>   

It is a log of where each ref pointed at any given time. Or rather, a 
log of changes to refs, with timestamps. It is not *required* per se 
(you can turn it off and almost all of git will continue to work as 
before) but it's handy in that you can say stuff like

git checkout -b newbranch master@"{4 days ago}"

and git will give you a new branch pointing at the rev that master 
pointed to 4 days ago, even if it's a rev that is no longer reachable 
from any of the existing heads (e.g., because you did a "git rebase" and 
the rev in question was replaced by a new one.) Obviously as soon as you 
do a "git gc" you will lose the ability to go back to unreachable revs 
using the reflog.

I primarily use the reflog to undo rebase operations. Not that I need to 
do that very often, but it's occasionally handy, e.g., if there was a 
conflict and I made a mistake while resolving it.
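
A small sketch of that kind of recovery in a scratch repository. It uses
"git reset --hard" to stand in for a rebase, and the index form
branch@{1} instead of a date, since a freshly created repository has no
history from days ago. The branch name "rescued" is invented for the
example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name Example
git config user.email example@example.com

echo one > file.txt
git add file.txt
git commit -q -m "first"
first=$(git rev-parse HEAD)
echo two > file.txt
git commit -q -a -m "second"
second=$(git rev-parse HEAD)
branch=$(git symbolic-ref --short HEAD)

# Rewind the branch: "second" is no longer reachable from any head.
git reset -q --hard "$first"

# The reflog still records where the branch pointed before the reset.
git reflog show "$branch"

# branch@{1} is the previous value of the ref, i.e. the "second" commit.
git checkout -q -b rescued "${branch}@{1}"
git log --oneline -1
```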

-Steve

^ permalink raw reply

* Re: Git benchmark - comparison with Bazaar, Darcs, Git and Mercurial
From: Junio C Hamano @ 2007-08-01  8:48 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Linus Torvalds, Git Mailing List
In-Reply-To: <200708011033.00873.jnareb@gmail.com>

Jakub Narebski <jnareb@gmail.com> writes:

> About clone: there was "pack loose, copy existing packs" idea.

Can you give more details --- I do not recall such an "idea"
discussed.

^ permalink raw reply

* Re: Git benchmark - comparison with Bazaar, Darcs, Git and Mercurial
From: David Kastrup @ 2007-08-01  8:48 UTC (permalink / raw)
  To: git
In-Reply-To: <7vodhrby6f.fsf@assigned-by-dhcp.cox.net>


Junio C Hamano <gitster@pobox.com> writes:

> Linus Torvalds <torvalds@linux-foundation.org> writes:
>
>> (Which admittedly is a bit odd. The reason I didn't ever seriously even 
>> consider monotone was that the initial import was so *incredibly* sucky, 
>> and took hours for the kernel. So use "-l" for benchmarks, and damn my 
>> "I hate hardlinking repos" idiocy).
>
> I would call aversion to -l a superstition, while aversion to -s
> has sound technical reasons.  The latter means you need to know
> what you are doing --- namely, you are making the clone still
> dependent on the original.

Well, I'd not call the -l aversion a complete superstition: it means
that cloning a repository won't provide any redundancy worth noting
against file system corruption.

-- 
David Kastrup

^ permalink raw reply

* Re: [PATCH 4/4] Clean up work-tree handling
From: Junio C Hamano @ 2007-08-01  8:59 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: gitster, git, matled
In-Reply-To: <Pine.LNX.4.64.0708010129530.14781@racer.site>

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> diff --git a/builtin-ls-files.c b/builtin-ls-files.c
> index 61577ea..d36181a 100644
> --- a/builtin-ls-files.c
> +++ b/builtin-ls-files.c
> @@ -469,9 +469,11 @@ int cmd_ls_files(int argc, const char **argv, const char *prefix)
>  		break;
>  	}
>  
> -	if (require_work_tree &&
> -			(!is_inside_work_tree() || is_inside_git_dir()))
> -		die("This operation must be run in a work tree");
> +	if (require_work_tree && !is_inside_work_tree()) {
> +		const char *work_tree = get_git_work_tree();
> +		if (!work_tree || chdir(work_tree))
> +			die("This operation must be run in a work tree");
> +	}
>  
>  	pathspec = get_pathspec(prefix, argv + i);
>  

Similarly to this change, I am wondering if we would want to fix
verify_non_filename() in setup.c, which does this:

/*
 * Verify a filename that we got as an argument for a pathspec
 * entry. Note that a filename that begins with "-" never verifies
 * as true, because even if such a filename were to exist, we want
 * it to be preceded by the "--" marker (or we want the user to
 * use a format like "./-filename")
 */
void verify_filename(const char *prefix, const char *arg)
{
...
}

/*
 * Opposite of the above: the command line did not have -- marker
 * and we parsed the arg as a refname.  It should not be interpretable
 * as a filename.
 */
void verify_non_filename(const char *prefix, const char *arg)
{
        const char *name;
        struct stat st;

        if (!is_inside_work_tree() || is_inside_git_dir())
                return;
        if (*arg == '-')
                return; /* flag */
        name = prefix ? prefix_filename(prefix, strlen(prefix), arg) : arg;
        if (!lstat(name, &st))
                die("ambiguous argument '%s': both revision and filename\n"
                    "Use '--' to separate filenames from revisions", arg);
        if (errno != ENOENT)
                die("'%s': %s", arg, strerror(errno));
}

At this point, we are given an ambiguous parameter, that could
be naming a path in the work tree.  If we are not in the work
tree, then it is understandable that we do not have to barf.
The other check (i.e. "|| is_inside_git_dir()") does not hurt
(iow, it is not an incorrect check per-se), because if you did
"cd .git && git log HEAD" then the HEAD parameter cannot be
naming the path ".git/HEAD" in the work tree, but (1) that is
already covered by .git/ being "outside of work tree", and (2)
it is not something this function wants to check anyway
(i.e. "can the parameter be naming a file in the work tree?").

Am I mistaken and/or confused?
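
For reference, the check under discussion can be exercised from the command
line. This is only a sketch in a scratch repository: it creates a work-tree
file named after the current branch, which is an artificial setup chosen to
trigger the "both revision and filename" error quoted above.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name Example
git config user.email example@example.com
echo data > file.txt
git add file.txt
git commit -q -m "initial"
branch=$(git symbolic-ref --short HEAD)

# A work-tree file whose name is the same as the branch name.
echo clash > "$branch"

# Without "--" the argument could be a revision or a path, so git refuses.
if git log --oneline "$branch" >/dev/null 2>&1; then
    ambiguous=accepted
else
    ambiguous=rejected
fi
echo "bare argument: $ambiguous"

# With "--" the argument is unambiguously a revision.
git log --oneline "$branch" --

# Inside .git we are outside the work tree, so HEAD cannot be naming
# the file .git/HEAD and no ambiguity is reported.
(cd .git && git log --oneline HEAD)
```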

^ permalink raw reply

* Re: MinGW build environment for Git
From: Dmitry Kakurin @ 2007-08-01  9:08 UTC (permalink / raw)
  To: Marius Storm-Olsen; +Cc: git
In-Reply-To: <46B016FC.4050005@trolltech.com>

On 7/31/07, Marius Storm-Olsen <marius@trolltech.com> wrote:
> Dmitry Kakurin said the following on 30.07.2007 01:14:
> > I want to be able to build MinGW port of Git on Windows. I've tried
> > to follow steps in README.MinGW to setup this environment myself
> > (install MinGW, MSys, ZLib etc.) but after wasting a lot of time
> > with no result I give up. So, could somebody please just pkzip
> > their environment (everything required) and share the zip file with
> > me (privately or publicly)? I also think that an even better idea
> > is to create a new Git repository with MinGW build environment.
> > This will make contributing to MinGW port of Git MUCH easier.
>
> Aaron has done this, and you can find the link on his blog, here:
>     http://www.ekips.org/cgi-bin/aaron.cgi/2007/02/27

I've downloaded and installed it. But I could not make it work :-(.
First I had this problem:
$ make
GIT_VERSION = 1.5.3.GIT
    * new build flags or prefix
    CC convert-objects.o
gcc.exe: installation problem, cannot exec `cc1': No such file or directory
make: *** [convert-objects.o] Error 1

Then I copied cc1.exe and some others from
C:\mingw4git\libexec\gcc\mingw32 into /bin.

$ make
    CC convert-objects.o
In file included from cache.h:4,
                 from convert-objects.c:1:
git-compat-util.h:51:22: sys/wait.h: No such file or directory

Searching entire (downloaded) tree for wait.h gives nothing.

'make configure' does not work:
$ make configure
    GEN configure
configure.ac+:4: error: Autoconf version 2.59 or higher is required
configure.ac+:4: the top level
autom4te: /bin/m4 failed with exit status: 1
make: *** [configure] Error 1

What do I do now?

-Dmitry

^ permalink raw reply

