Git development

Git development
 help / color / mirror / Atom feed

* Re: git-clone --how-much-disk-space-will-this-cost-me? [--depth n]
From: Shawn O. Pearce @ 2008-12-17 15:44 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Jean-Luc Herren, jidanni, git
In-Reply-To: <alpine.LFD.2.00.0812160039180.30035@xanadu.home>

Nicolas Pitre <nico@cam.org> wrote:
> On Tue, 16 Dec 2008, Jean-Luc Herren wrote:
> > jidanni@jidanni.org wrote:
> > > JH> So maybe what you really want is an ETA display during the cloning
> > > JH> process?  Sounds like a good idea to me.
> 
> And then you'll end up being the unlucky bastard to be the first to 
> clones the new latest revision of a repository, and ETA won't be 
> available, and you'll complain about the fact that sometimes it is there 
> and sometimes it is not.
> 
> The fact is, fundamentally, we don't know how many bytes to push when 
> generating a pack to answer the clone request.  Sometimes we _could_ but 
> not always.  It is therefore better to be consistent and let people know 
> that there is simply no ETA.

Hmm.

What if on an initial clone (no "have" lines received) we sum up
the sizes of the *.pack and all of the loose objects and sent
that as an initial size estimate.  Its going to be the upper bound
of the final pack that we send.  At worst it over-estimates on the
size and download finishes faster.

I'm willing to bet that most of the "big" repositories out there
don't have a lot of garbage in them.  Linus' kernel repository
doesn't rewind, so he has 0 garbage.  Anyone cloning from him would
get a reasonable estimate.  Likewise with a Gentoo/KDE/WebKit/gcc
sort of giant tree most of that is in a huge historical pack.
That one pack file alone is completely reachable and dominates the
transfer size.

On smaller trees where people may have a lot of rebase garbage or
everything is loose the estimate will be quite a bit above what we
transfer, but how much so that it matters?

Yea, a single stray binary of some *.mpg or *.iso accidentally
added and then removed (and now unreachable) will vastly inflate
the numbers.  In which case the repository owner will be encouraged
to prune when people won't clone his estimated 8 GiB download,
which is actually only 1 MiB.

-- 
Shawn.

^ permalink raw reply

* Re: [PATCH] git-gui: Get rid of the last remnants of GIT_CONFIG_LOCAL
From: Shawn O. Pearce @ 2008-12-17 15:53 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git
In-Reply-To: <alpine.DEB.1.00.0812142311360.25197@racer>

Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> 
> In dc871831(Only use GIT_CONFIG in "git config", not other programs),
> GIT_CONFIG_LOCAL was rested in peace, in favor of not reading
> /etc/gitconfig and $HOME/.gitconfig at all when GIT_CONFIG is set.
> 
> 	You might want to hold off on that, as git-gui is supposed to be 
> 	less tightly coupled to git.git than, say, git-svn.
> 
> 	The commit removing support for GIT_CONFIG_LOCAL is from Mon Jun 
> 	30 03:37:47 2008 -0400...

Nah, that's in Git 1.6.0.  This block of code is only useful on
Windows, where most people are probably running a more msysgit
style system than some random mashup of Git and git-gui versions.

Plus, wasn't there just some discussion about how this warning
is perhaps harmful to the user because it doesn't really help
her to resolve the problem?

I applied it anyway.  Its quite unlikely anyone runs git-gui on
Windows with GIT_CONFIG_LOCAL in their environment.

-- 
Shawn.

^ permalink raw reply

* Re: [EGIT PATCH 1/2] Revert "Fix commit id in egit test T0001_ConnectProviderOperationTest"
From: Shawn O. Pearce @ 2008-12-17 16:09 UTC (permalink / raw)
  To: Robin Rosenberg; +Cc: git
In-Reply-To: <1229472439-24104-2-git-send-email-robin.rosenberg@dewire.com>

Robin Rosenberg <robin.rosenberg@dewire.com> wrote:
> This reverts commit 61133091d5f22398828b350ff772165e9945db8a.
> 
> Bisect says this is the commit that failed, which is odd. Bad QA.

Hmmph.  This revert fails here.

> diff --git a/org.spearce.egit.core.test/src/org/spearce/egit/core/op/T0001_ConnectProviderOperationTest.java b/org.spearce.egit.core.test/src/org/spearce/egit/core/op/T0001_ConnectProviderOperationTest.java
> index 0ce2d7f..aae1ef4 100644
> --- a/org.spearce.egit.core.test/src/org/spearce/egit/core/op/T0001_ConnectProviderOperationTest.java
> +++ b/org.spearce.egit.core.test/src/org/spearce/egit/core/op/T0001_ConnectProviderOperationTest.java
> @@ -99,8 +99,9 @@ assertTrue("tree missing", new File(gitDir,
>  				"objects/08/ccc3d91a14d337a45f355d3d63bd97fd5e4db9").exists());
>  		assertTrue("tree missing", new File(gitDir,
>  				"objects/9d/aeec817090098f05eeca858e3a552d78b0a346").exists());
> +
>  		assertTrue("commit missing", new File(gitDir,
> -				"objects/09/6f1a84091b90b6d9fb12f95848da69496305c1").exists());
> +				"objects/45/df73fd9abbc2c61620c036948b1157e4d21253").exists());

Debugging this test shows that the commit we created in the test
is actually:

--
$ git cat-file commit 4c1bc1435f93c9409c93db5239e111271a8ccf55
tree 9daeec817090098f05eeca858e3a552d78b0a346
author J. Git <j.git@egit.org> 60876086400 +0100
committer J. Git <j.git@egit.org> 60876086400 +0100

testNewUnsharedFile

Junit tests
--

Which has me starting to wonder, what the heck is different
between systems that is being reflected in this commit object?
The only thing I can think of is the timestamp we are creating by
the deprecated Date constructor call back on line 82.  Perhaps
on different JVMs it is using different values for the hh:mm:ss
parts of the timestamp value?

-- 
Shawn.

^ permalink raw reply

* Re: git-clone --how-much-disk-space-will-this-cost-me? [--depth n]
From: Nicolas Pitre @ 2008-12-17 16:15 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Jean-Luc Herren, jidanni, git
In-Reply-To: <20081217154407.GZ32487@spearce.org>

On Wed, 17 Dec 2008, Shawn O. Pearce wrote:

> Nicolas Pitre <nico@cam.org> wrote:
> > The fact is, fundamentally, we don't know how many bytes to push when 
> > generating a pack to answer the clone request.  Sometimes we _could_ but 
> > not always.  It is therefore better to be consistent and let people know 
> > that there is simply no ETA.
> 
> Hmm.
> 
> What if on an initial clone (no "have" lines received) we sum up
> the sizes of the *.pack and all of the loose objects and sent
> that as an initial size estimate.  Its going to be the upper bound
> of the final pack that we send.  At worst it over-estimates on the
> size and download finishes faster.

It is a kludge.  It makes the system imprecise for little benefit. Once 
you start adding kludges like that into your system, people will always 
ask for more kludges, and in the end your system isn't as reliable.  We 
all know about some other operating system which was designed like that. 
I personally don't want to go there.

> Yea, a single stray binary of some *.mpg or *.iso accidentally
> added and then removed (and now unreachable) will vastly inflate
> the numbers.  In which case the repository owner will be encouraged
> to prune when people won't clone his estimated 8 GiB download,
> which is actually only 1 MiB.

And I consider any system doing such thing completely stupid.  Either 
you consistently know the information or you don't.  When you don't, it 
is best to not create expectations for the user.  And so far I think 
that 99.9% of git users are just fine with the progress display we 
currently provide.

Nicolas

^ permalink raw reply

* Re: git-clone --how-much-disk-space-will-this-cost-me? [--depth n]
From: Shawn O. Pearce @ 2008-12-17 16:21 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Jean-Luc Herren, jidanni, git
In-Reply-To: <alpine.LFD.2.00.0812171104340.30035@xanadu.home>

Nicolas Pitre <nico@cam.org> wrote:
> 
> And I consider any system doing such thing completely stupid.  Either 
> you consistently know the information or you don't.  When you don't, it 
> is best to not create expectations for the user.  And so far I think 
> that 99.9% of git users are just fine with the progress display we 
> currently provide.

Certainly true here; I never care how big the source I'm cloning is.
But then again I have pretty good network connectivity at work
and at least cable modem service at home...  most things clone down
pretty fast.

Its a quick hack to give a size upper bound.  I don't think its
that ugly.  Our network protocol is uglier with all of its hidden
fields jammed behind that NUL in the first advertisement line.
But I digress.

The better feature is probably resumable clone anyway.  At least
then people can abort a "long running" clone and have a good chance
they can pick it up again in the near future.  Its also not easy to
implement, which is why we've only been talking about it for years
and never actually seen a patch proposing to do it.

-- 
Shawn.

^ permalink raw reply

* Re: git-clone --how-much-disk-space-will-this-cost-me? [--depth n]
From: Nicolas Pitre @ 2008-12-17 16:46 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Jean-Luc Herren, jidanni, git
In-Reply-To: <20081217162127.GG32487@spearce.org>

On Wed, 17 Dec 2008, Shawn O. Pearce wrote:

> Nicolas Pitre <nico@cam.org> wrote:
> > 
> > And I consider any system doing such thing completely stupid.  Either 
> > you consistently know the information or you don't.  When you don't, it 
> > is best to not create expectations for the user.  And so far I think 
> > that 99.9% of git users are just fine with the progress display we 
> > currently provide.
> 
> Certainly true here; I never care how big the source I'm cloning is.
> But then again I have pretty good network connectivity at work
> and at least cable modem service at home...  most things clone down
> pretty fast.
> 
> Its a quick hack to give a size upper bound.  I don't think its
> that ugly.  Our network protocol is uglier with all of its hidden
> fields jammed behind that NUL in the first advertisement line.
> But I digress.

The ugliness in the protocol is encapsulated away from user view, and we 
could even seemlessly introduce a new protocol at any time with no 
issues if we wanted to.

This "quick hack" is imprecise, unreliable, and directly affect user 
perception.  This is way more dammageable as once users are used to it, 
good or bad, it won't be possible to get rid of it.

> The better feature is probably resumable clone anyway.  At least
> then people can abort a "long running" clone and have a good chance
> they can pick it up again in the near future.

Absolutely.

> Its also not easy to
> implement, which is why we've only been talking about it for years
> and never actually seen a patch proposing to do it.

A partial clone could possibly be turned into a shalow clone if at least 
the top commit is complete ...


Nicolas

^ permalink raw reply

* Re: git-clone --how-much-disk-space-will-this-cost-me? [--depth n]
From: Shawn O. Pearce @ 2008-12-17 16:48 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Jean-Luc Herren, jidanni, git
In-Reply-To: <alpine.LFD.2.00.0812171136250.30035@xanadu.home>

Nicolas Pitre <nico@cam.org> wrote:
> > Its also not easy to
> > implement, which is why we've only been talking about it for years
> > and never actually seen a patch proposing to do it.
> 
> A partial clone could possibly be turned into a shalow clone if at least 
> the top commit is complete ...

But you of all people should know well that the top commit is also
a huge part of most clones.  Getting that top commit can be 30-60%
of the repository itself.  :-|

-- 
Shawn.

^ permalink raw reply

* Re: git-clone --how-much-disk-space-will-this-cost-me? [--depth n]
From: Nicolas Pitre @ 2008-12-17 16:56 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Jean-Luc Herren, jidanni, git
In-Reply-To: <20081217164841.GH32487@spearce.org>

On Wed, 17 Dec 2008, Shawn O. Pearce wrote:

> Nicolas Pitre <nico@cam.org> wrote:
> > > Its also not easy to
> > > implement, which is why we've only been talking about it for years
> > > and never actually seen a patch proposing to do it.
> > 
> > A partial clone could possibly be turned into a shalow clone if at least 
> > the top commit is complete ...
> 
> But you of all people should know well that the top commit is also
> a huge part of most clones.  Getting that top commit can be 30-60%
> of the repository itself.  :-|

Sure I know.  This is why I'm not pushing this solution really much.  ;)

I have ideas about how to solve this in a really nice way, but that 
implies pack V4.


Nicolas

^ permalink raw reply

* Re: Git Notes idea.
From: Govind Salinas @ 2008-12-17 17:06 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List, Johannes Schindelin
In-Reply-To: <20081217093843.GA18265@coredump.intra.peff.net>

On Wed, Dec 17, 2008 at 3:38 AM, Jeff King <peff@peff.net> wrote:
> On Tue, Dec 16, 2008 at 12:43:55PM -0600, Govind Salinas wrote:
>
>> I was thinking I would do my first implementation in pyrite and if I find
>> that it works well I will port it.
>
> OK, though your performance will probably suck unless you dump the notes
> tree into a local hash at the beginning of your program. And looking up
> every commit's note during revision traversal is one of the intended
> uses (e.g., decorating git-log output, or filtering commits based on a
> particular note).

Yes, I was thinking that this is the natural way to do things, save that I
would be lazy loading the trees into a cache instead of caching them
all up front.  This is one of the reasons that I think the fan out will
help.

> And as Dscho mentioned, most of what you need is already there in C.
> You are welcome to implement whatever you want in pyrite, of course, but
> there is a desire to have this accessible to the revision traversal
> machinery. And that means if you want your version in pyrite to be
> compatible with what ends up in git, the data structure design needs to
> be suitable for both.

Yes, I completely agree that I want it to have the same scheme as what
git will use.  That is the reason I posted this here.  Since no method
has been formally accepted (checked into master) I wanted to see if
I could nudge things along.  I wasn't aware that you and Dscho had
a (very similar) plan.  Please, if you guys are decided on the format
then I can just go off and start working on it.  But it sounds like there
isn't consensus yet.

<snip>
>> root/
>>      12/
>>          34567890123456789012345678901234567890/
>>              abcdef7890123456789012345678901234567890
>>
>> This way all the notes are attached to a tree, so that gc won't
>> think they are unreferenced objects.
>
> But you have lost the ordering in your list, then, since they will not
> be ordered by sha1 of the note contents. I don't know if you care. The
> second sha1 is pointless, anyway, since nobody will know that number as
> a reference; why not just name them monotonically starting at 1?

In a later mail I suggested that this be the type or name of the note.  Which
I hear is similar to what you suggested.

> One of the things I don't like about having several notes is that it
> introduces an extra level of indirection that every user has to pay for,
> whether they want it or not. If a note can be a blob _or_ a tree, then
> those who want to use blobs can reap the performance benefit. Those who
> want multiple named notes in a hierarchy can pay the extra indirection
> cost.
>
> I haven't measured how big a cost that is (but bearing in mind that we
> might want to do this lookup once per revision in a traversal, even one
> extra object lookup can have an impact).

That seems reasonable.

> I'm also still not convinced the fan-out is worthwhile, but I can see
> how it might be. It would be nice to see numbers for both.
>
<snip>
> Also, how large do you expect the list to be under reasonable
>> circumstances.
>
> As many notes as there are commits is my goal (e.g., it is not hard to
> imagine an automated process to add notes on build status). Ideally, we
> could handle as many notes as there are objects; I see no reason not to
> allow annotating arbitrary sha1's (I don't know if there is a use for
> that, but the more scalable the implementation, the better).

Ah, that is in line with what I was thinking as well.

>> >  Some thoughts from me on naming issues:
>> >  http://article.gmane.org/gmane.comp.version-control.git/100402
>>
>> On naming.  I strongly support a ref/notes/sha1/sha1 approach.  If
>> having a type to the note is important, then perhaps the first line of
>> a note could be considered a type or a set of "tags".  This way you
>
> I don't think we are talking about the same thing. What I mean by naming
> is "here is a shorthand for referring to notes" that is not necessarily
> coupled with the implementation. That is, I would like to do something
> like:
>
>  git log --notes-filter="foo:bar == 1"
>
> and have that "foo:bar" as a shorthand on each commit for:
>
>  refs/notes/foo:$COMMIT/bar
>
> Without a left-hand side (e.g., "bar"), we get:
>
>  refs/notes/default:$COMMIT/bar
>
> Or without a right-hand side (e.g., "foo:"), we get:
>
>  refs/notes/foo:$COMMIT
>

I like the overall plan, but I would suggest that --notes[=default] and
--note-type=whatever would be a little friendlier and less error prone.

Thanks for helping me think through this.

-Govind

^ permalink raw reply

* Re: rebasing commits that have notes, was Re: Git Notes idea.
From: Stephan Beyer @ 2008-12-17 17:55 UTC (permalink / raw)
  To: Johan Herland; +Cc: git, Johannes Schindelin, Jeff King, Govind Salinas
In-Reply-To: <200812171015.45303.johan@herland.net>

Hi,

Johan Herland wrote:
> PS: What's the current status on git-sequencer?

Usable, but I still have a small todo list of things to add
or fix. (And the next days some more time again. I hope it's
enough. Sorry, I'm too lame.)

> It's probably the best place to invoke these hooks.

If you want to do stuff on top of git sequencer it's currently
the best to fetch
	git://repo.or.cz/git/sbeyer.git seq-builtin-dev
or just to wait. :-\

Regards,
  Stephan

-- 
Stephan Beyer <s-beyer@gmx.net>, PGP 0x6EDDD207FCC5040F

^ permalink raw reply

* [PATCH 0/5] Be careful about lstat()-vs-readlink()
From: Linus Torvalds @ 2008-12-17 18:42 UTC (permalink / raw)
  To: Junio C Hamano, Git Mailing List

This series of five patches makes us a lot more careful about readlink(), 
and in particular it avoids the code that depends on doing an lstat(), and 
then using st_size as the length of the link. That's what POSIX says 
_should_ happen, but Ramon Tayag reported git failing on NTFS under Linux 
because st_size doesn't match readlink() return value.

It doesn't seem to be unknown elsewhere either, since coreutils (through 
gnulib's areadlink_with_size) also has a big compatibility layer around 
readlink().

This series has been 'tested' by running it with the appended totally 
hacky patch that fakes out the lstat() st_size values for symlinks, so it 
has actually had some testing. 

The complete series does:

Linus Torvalds (5):
  Add generic 'strbuf_readlink()' helper function
  Make 'ce_compare_link()' use the new 'strbuf_readlink()'
  Make 'index_path()' use 'strbuf_readlink()'
  Make 'diff_populate_filespec()' use the new 'strbuf_readlink()'
  Make 'prepare_temp_file()' ignore st_size for symlinks

 builtin-apply.c |    6 ++----
 diff.c          |   25 +++++++++++--------------
 read-cache.c    |   22 ++++++++--------------
 sha1_file.c     |   14 +++++---------
 strbuf.c        |   28 ++++++++++++++++++++++++++++
 strbuf.h        |    1 +
 6 files changed, 55 insertions(+), 41 deletions(-)

and the hacky test-patch (which is obviously _not_ meant to be applied) is 
as follows...

		Linus

---
diff --git a/cache.h b/cache.h
index 231c06d..2d85fca 100644
--- a/cache.h
+++ b/cache.h
@@ -943,4 +943,7 @@ void overlay_tree_on_cache(const char *tree_name, const char *prefix);
 char *alias_lookup(const char *alias);
 int split_cmdline(char *cmdline, const char ***argv);

+extern int gitlstat(const char *, struct stat *);
+#define lstat gitlstat
+
 #endif /* CACHE_H */
diff --git a/read-cache.c b/read-cache.c
index b1475ff..defbb20 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -15,6 +15,16 @@
 #include "revision.h"
 #include "blob.h"

+#undef lstat
+int gitlstat(const char *path, struct stat *buf)
+{
+	int retval = lstat(path, buf);
+	if (!retval && S_ISLNK(buf->st_mode))
+		buf->st_size = 1;
+	return retval;
+}
+#define lstat gitlstat
+
 /* Index extensions.
  *
  * The first letter should be 'A'..'Z' for extensions that are not

^ permalink raw reply related

* [PATCH 1/5] Add generic 'strbuf_readlink()' helper function
From: Linus Torvalds @ 2008-12-17 18:42 UTC (permalink / raw)
  To: Junio C Hamano, Git Mailing List
In-Reply-To: <alpine.LFD.2.00.0812171034520.14014@localhost.localdomain>


From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Wed, 17 Dec 2008 09:36:40 -0800

It was already what 'git apply' did in read_old_data(), just export it
as a real function, and make it be more generic.

In particular, this handles the case of the lstat() st_size data not
matching the readlink() return value properly (which apparently happens
at least on NTFS under Linux).  But as a result of this you could also
use the new function without even knowing how big the link is going to
be, and it will allocate an appropriately sized buffer.

So we pass in the st_size of the link as just a hint, rather than a
fixed requirement.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 builtin-apply.c |    6 ++----
 strbuf.c        |   28 ++++++++++++++++++++++++++++
 strbuf.h        |    1 +
 3 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/builtin-apply.c b/builtin-apply.c
index 4c4d1e1..07244b0 100644
--- a/builtin-apply.c
+++ b/builtin-apply.c
@@ -1559,10 +1559,8 @@ static int read_old_data(struct stat *st, const char *path, struct strbuf *buf)
 {
 	switch (st->st_mode & S_IFMT) {
 	case S_IFLNK:
-		strbuf_grow(buf, st->st_size);
-		if (readlink(path, buf->buf, st->st_size) != st->st_size)
-			return -1;
-		strbuf_setlen(buf, st->st_size);
+		if (strbuf_readlink(buf, path, st->st_size) < 0)
+			return error("unable to read symlink %s", path);
 		return 0;
 	case S_IFREG:
 		if (strbuf_read_file(buf, path, st->st_size) != st->st_size)
diff --git a/strbuf.c b/strbuf.c
index 13be67e..904a2b0 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -288,6 +288,34 @@ ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint)
 	return sb->len - oldlen;
 }
 
+#define STRBUF_MAXLINK (2*PATH_MAX)
+
+int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
+{
+	if (hint < 32)
+		hint = 32;
+
+	while (hint < STRBUF_MAXLINK) {
+		int len;
+
+		strbuf_grow(sb, hint);
+		len = readlink(path, sb->buf, hint);
+		if (len < 0) {
+			if (errno != ERANGE)
+				break;
+		} else if (len < hint) {
+			strbuf_setlen(sb, len);
+			return 0;
+		}
+
+		/* .. the buffer was too small - try again */
+		hint *= 2;
+		continue;
+	}
+	strbuf_release(sb);
+	return -1;
+}
+
 int strbuf_getline(struct strbuf *sb, FILE *fp, int term)
 {
 	int ch;
diff --git a/strbuf.h b/strbuf.h
index b1670d9..89bd36e 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -124,6 +124,7 @@ extern size_t strbuf_fread(struct strbuf *, size_t, FILE *);
 /* XXX: if read fails, any partial read is undone */
 extern ssize_t strbuf_read(struct strbuf *, int fd, size_t hint);
 extern int strbuf_read_file(struct strbuf *sb, const char *path, size_t hint);
+extern int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint);
 
 extern int strbuf_getline(struct strbuf *, FILE *, int);
 
-- 
1.6.1.rc3.3.gcc3e3

^ permalink raw reply related

* [PATCH 2/5] Make 'ce_compare_link()' use the new 'strbuf_readlink()'
From: Linus Torvalds @ 2008-12-17 18:43 UTC (permalink / raw)
  To: Junio C Hamano, Git Mailing List
In-Reply-To: <alpine.LFD.2.00.0812171042120.14014@localhost.localdomain>


From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Wed, 17 Dec 2008 09:47:27 -0800

This simplifies the code, and also makes ce_compare_link now able to
handle filesystems with odd 'st_size' return values for symlinks.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 read-cache.c |   22 ++++++++--------------
 1 files changed, 8 insertions(+), 14 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index c14b562..b1475ff 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -99,27 +99,21 @@ static int ce_compare_data(struct cache_entry *ce, struct stat *st)
 static int ce_compare_link(struct cache_entry *ce, size_t expected_size)
 {
 	int match = -1;
-	char *target;
 	void *buffer;
 	unsigned long size;
 	enum object_type type;
-	int len;
+	struct strbuf sb = STRBUF_INIT;
 
-	target = xmalloc(expected_size);
-	len = readlink(ce->name, target, expected_size);
-	if (len != expected_size) {
-		free(target);
+	if (strbuf_readlink(&sb, ce->name, expected_size))
 		return -1;
-	}
+
 	buffer = read_sha1_file(ce->sha1, &type, &size);
-	if (!buffer) {
-		free(target);
-		return -1;
+	if (buffer) {
+		if (size == sb.len)
+			match = memcmp(buffer, sb.buf, size);
+		free(buffer);
 	}
-	if (size == expected_size)
-		match = memcmp(buffer, target, size);
-	free(buffer);
-	free(target);
+	strbuf_release(&sb);
 	return match;
 }
 
-- 
1.6.1.rc3.3.gcc3e3

^ permalink raw reply related

* [PATCH 3/5] Make 'index_path()' use 'strbuf_readlink()'
From: Linus Torvalds @ 2008-12-17 18:43 UTC (permalink / raw)
  To: Junio C Hamano, Git Mailing List
In-Reply-To: <alpine.LFD.2.00.0812171042500.14014@localhost.localdomain>


From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Wed, 17 Dec 2008 09:51:53 -0800

This makes us able to properly index symlinks even on filesystems where
st_size doesn't match the true size of the link.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 sha1_file.c |   14 +++++---------
 1 files changed, 5 insertions(+), 9 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 0e021c5..52d1ead 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2523,8 +2523,7 @@ int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object,
 int index_path(unsigned char *sha1, const char *path, struct stat *st, int write_object)
 {
 	int fd;
-	char *target;
-	size_t len;
+	struct strbuf sb = STRBUF_INIT;
 
 	switch (st->st_mode & S_IFMT) {
 	case S_IFREG:
@@ -2537,20 +2536,17 @@ int index_path(unsigned char *sha1, const char *path, struct stat *st, int write
 				     path);
 		break;
 	case S_IFLNK:
-		len = xsize_t(st->st_size);
-		target = xmalloc(len + 1);
-		if (readlink(path, target, len + 1) != st->st_size) {
+		if (strbuf_readlink(&sb, path, st->st_size)) {
 			char *errstr = strerror(errno);
-			free(target);
 			return error("readlink(\"%s\"): %s", path,
 			             errstr);
 		}
 		if (!write_object)
-			hash_sha1_file(target, len, blob_type, sha1);
-		else if (write_sha1_file(target, len, blob_type, sha1))
+			hash_sha1_file(sb.buf, sb.len, blob_type, sha1);
+		else if (write_sha1_file(sb.buf, sb.len, blob_type, sha1))
 			return error("%s: failed to insert into database",
 				     path);
-		free(target);
+		strbuf_release(&sb);
 		break;
 	case S_IFDIR:
 		return resolve_gitlink_ref(path, "HEAD", sha1);
-- 
1.6.1.rc3.3.gcc3e3

^ permalink raw reply related

* [PATCH 4/5] Make 'diff_populate_filespec()' use the new 'strbuf_readlink()'
From: Linus Torvalds @ 2008-12-17 18:44 UTC (permalink / raw)
  To: Junio C Hamano, Git Mailing List
In-Reply-To: <alpine.LFD.2.00.0812171043180.14014@localhost.localdomain>


From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Wed, 17 Dec 2008 10:26:13 -0800

This makes all tests pass on a system where 'lstat()' has been hacked to
return bogus data in st_size for symlinks.

Of course, the test coverage isn't complete, but it's a good baseline.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 diff.c |   16 +++++++---------
 1 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/diff.c b/diff.c
index afefe08..4b2029c 100644
--- a/diff.c
+++ b/diff.c
@@ -1773,19 +1773,17 @@ int diff_populate_filespec(struct diff_filespec *s, int size_only)
 		s->size = xsize_t(st.st_size);
 		if (!s->size)
 			goto empty;
-		if (size_only)
-			return 0;
 		if (S_ISLNK(st.st_mode)) {
-			int ret;
-			s->data = xmalloc(s->size);
-			s->should_free = 1;
-			ret = readlink(s->path, s->data, s->size);
-			if (ret < 0) {
-				free(s->data);
+			struct strbuf sb = STRBUF_INIT;
+
+			if (strbuf_readlink(&sb, s->path, s->size))
 				goto err_empty;
-			}
+			s->data = strbuf_detach(&sb, &s->size);
+			s->should_free = 1;
 			return 0;
 		}
+		if (size_only)
+			return 0;
 		fd = open(s->path, O_RDONLY);
 		if (fd < 0)
 			goto err_empty;
-- 
1.6.1.rc3.3.gcc3e3

^ permalink raw reply related

* [PATCH 5/5] Make 'prepare_temp_file()' ignore st_size for symlinks
From: Linus Torvalds @ 2008-12-17 18:45 UTC (permalink / raw)
  To: Junio C Hamano, Git Mailing List
In-Reply-To: <alpine.LFD.2.00.0812171043440.14014@localhost.localdomain>


From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Wed, 17 Dec 2008 10:31:36 -0800

The code was already set up to not really need it, so this just massages
it a bit to remove the use entirely.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---

There's a few raw 'readlink()' calls left, but they all seem to just have 
a big buffer and rely on the return value of readlink() rather than look 
too closely at st_size. But it would probably be good if somebody 
double-checked it all. 

Even so, this should all be better than what we used to have, though.

 diff.c |    9 ++++-----
 1 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/diff.c b/diff.c
index 4b2029c..f160c1a 100644
--- a/diff.c
+++ b/diff.c
@@ -1881,13 +1881,12 @@ static void prepare_temp_file(const char *name,
 		if (S_ISLNK(st.st_mode)) {
 			int ret;
 			char buf[PATH_MAX + 1]; /* ought to be SYMLINK_MAX */
-			size_t sz = xsize_t(st.st_size);
-			if (sizeof(buf) <= st.st_size)
-				die("symlink too long: %s", name);
-			ret = readlink(name, buf, sz);
+			ret = readlink(name, buf, sizeof(buf));
 			if (ret < 0)
 				die("readlink(%s)", name);
-			prep_temp_blob(temp, buf, sz,
+			if (ret == sizeof(buf))
+				die("symlink too long: %s", name);
+			prep_temp_blob(temp, buf, ret,
 				       (one->sha1_valid ?
 					one->sha1 : null_sha1),
 				       (one->sha1_valid ?
-- 
1.6.1.rc3.3.gcc3e3

^ permalink raw reply related

* Announcement: Git Extensions stable (windows shell extensions)
From: Henk @ 2008-12-17 19:06 UTC (permalink / raw)
  To: git

This is a shameless announcement of my latest personal project; Git
Extensions. Git Extensions is a Tortoise-like windows shell extension for
git. Yesterday I finished version 0.9, the first stable release. I included
about all git commands I know about, so I think it is pretty complete but
I'm open to suggestions. 

It is written mostly in C#, except for shell extension part which is written
in C++. The project is open source, the sources can be found on GitHub. In
case there is someone interrested in the sources, be warned; the sources are
not very well documented yet and the solution is a still bit messy, I will
clean this up very soon. 

Main features
- Shell extensions
- Visual studio plugin
- Seperate git application

Features:
- Browse repository (incl. visual graph)
- Add files
- Apply patch
- Checkout branch/revision
- Cherry pick
- Create branch/tag
- Delete branch/tag
- Clone
- Commit
- Create (format) patch
- Init new repository
- Merge branches
- Pull
- Push
- Run mergetool
- Stash
- View differences

Information about the project and a installer package can be found here:
http://sourceforge.net/projects/gitextensions/
The installation requires msysgit to be installed AND git.exe to in the
system path.

ps.
I know there is a TortoiseGit project also, I just didn't know about that at
the time I started. If I knew about TortoiseGit, I probably never started
writing my own tool.
-- 
View this message in context: http://n2.nabble.com/Announcement%3A-Git-Extensions-stable-%28windows-shell-extensions%29-tp1669264p1669264.html
Sent from the git mailing list archive at Nabble.com.

^ permalink raw reply

* Re: Git Notes idea.
From: Junio C Hamano @ 2008-12-17 19:20 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Jeff King, Govind Salinas, Git Mailing List
In-Reply-To: <alpine.DEB.1.00.0812171233270.28560@intel-tinevez-2-302>

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> ...  But the main point 
> still stands: you go from commit to note, not from note to commit.  And 
> this is in stark contrast to tags, where you go from tag to commit, _not_ 
> from commit to tag.
>
> That is a fundamental _difference_ between tags and notes, so that I 
> refuse to accept the notion of notes being a generalized form of tags.

Hmm, how would you explain things like "git describe" (and "name-rev")?

^ permalink raw reply

* Re: Announcement: Git Extensions stable (windows shell extensions)
From: Sverre Rabbelier @ 2008-12-17 19:23 UTC (permalink / raw)
  To: Henk; +Cc: git
In-Reply-To: <1229540813648-1669264.post@n2.nabble.com>

On Wed, Dec 17, 2008 at 20:06, Henk <henk_westhuis@hotmail.com> wrote:
> I know there is a TortoiseGit project also, I just didn't know about that at
> the time I started. If I knew about TortoiseGit, I probably never started
> writing my own tool.

It seems TortoiseGit started only recently and is not nearly as
feature-complete as Git Extensions, so I wouldn't feel too bad in that
regard ;).

-- 
Cheers,

Sverre Rabbelier

^ permalink raw reply

* Re: Can I forbid somebody to pull some branch or tag from my repo with git protocol?
From: Junio C Hamano @ 2008-12-17 19:36 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Emily Ren, git
In-Reply-To: <alpine.DEB.1.00.0812171322330.28560@intel-tinevez-2-302>

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> On Wed, 17 Dec 2008, Emily Ren wrote:
>
>> I want some group can pull these branches or tags from my repo, while
>> other's can't, Need I maintain two repositories ?
>
> Either that (that would be the easy method, and also the proper one, since 
> people would not even know what you hide), but you could patch 
> upload-pack so that it runs a hook with the rev-list arguments in 
> do_rev_list() in upload-pack.c, and die() if the hook returns non-zero.

I do not think that would work very well as you expect.  Two branches can
be pointing at the same commit, and Emily may want to hide one but not the
other.  The time you obtain from "want" is too late.

If you were to extend upload-pack, the place to narrow would be the
initial "here are the refs and the objects they point at" announcement
that is done at the very beginning.  You would do something like the
pseudo patch attached at the end.

read_set_of_exposed_refs_from_hook() should return, depending on who the
user is (which is obviously not available if this connection is over the
anonymous git-daemon service, but local and usual ssh connection you could
do whoami, and on gitosis there would be some environment variable to
distinguish who you are that you can use), the set of refs that the user
is allowed to see.

diff --git i/upload-pack.c w/upload-pack.c
index e5adbc0..129aa1e 100644
--- i/upload-pack.c
+++ w/upload-pack.c
@@ -10,6 +10,10 @@
 #include "revision.h"
 #include "list-objects.h"
 #include "run-command.h"
+#include "string-list.h"
+
+static int use_ref_limiting;
+static struct string_list exposed_refs;

 static const char upload_pack_usage[] = "git-upload-pack [--strict] [--timeout=nn] <dir>";

@@ -574,8 +578,14 @@ static int send_ref(const char *refname, const unsigned char *sha1, int flag, vo
 	static const char *capabilities = "multi_ack thin-pack side-band"
 		" side-band-64k ofs-delta shallow no-progress"
 		" include-tag";
-	struct object *o = parse_object(sha1);
+	struct object *o;
+
+	if (use_ref_limiting && !string_list_has_string(&exposed_refs, refname)) {
+		/* The downloader is not allowed to know the presense of this ref */
+		return 0;
+	}

+	o = parse_object(sha1);
 	if (!o)
 		die("git upload-pack: cannot find object %s:", sha1_to_hex(sha1));

@@ -600,6 +610,12 @@ static int send_ref(const char *refname, const unsigned char *sha1, int flag, vo
 static void upload_pack(void)
 {
 	reset_timeout();
+
+	if ("limit exposed refs" hook is available) {
+		use_ref_limiting = 1;
+		read_set_of_exposed_refs_from_hook(&exposed_refs);
+	}
+
 	head_ref(send_ref, NULL);
 	for_each_ref(send_ref, NULL);
 	packet_flush(1);

^ permalink raw reply related

* Re: Announcement: Git Extensions stable (windows shell extensions)
From: tekkub @ 2008-12-17 19:51 UTC (permalink / raw)
  To: git
In-Reply-To: <1229540813648-1669264.post@n2.nabble.com>


Very nice, however I ran into the following error while trying to install:
http://www.flickr.com/photos/26681170@N03/3115836831/

Can't wait to try it out!
--tek


Henk wrote:
> 
> This is a shameless announcement of my latest personal project; Git
> Extensions. Git Extensions is a Tortoise-like windows shell extension for
> git. Yesterday I finished version 0.9, the first stable release. I
> included about all git commands I know about, so I think it is pretty
> complete but I'm open to suggestions. 
>  
> It is written mostly in C#, except for shell extension part which is
> written in C++. The project is open source, the sources can be found on
> GitHub. In case there is someone interrested in the sources, be warned;
> the sources are not very well documented yet and the solution is a still
> bit messy, I will clean this up very soon. 
>  
> Main features
> - Shell extensions
> - Visual studio plugin
> - Seperate git application
>  
> Features:
> - Browse repository (incl. visual graph)
> - Add files
> - Apply patch
> - Checkout branch/revision
> - Cherry pick
> - Create branch/tag
> - Delete branch/tag
> - Clone
> - Commit
> - Create (format) patch
> - Init new repository
> - Merge branches
> - Pull
> - Push
> - Run mergetool
> - Stash
> - View differences
>  
> Information about the project and a installer package can be found here:
> http://sourceforge.net/projects/gitextensions/
> The installation requires msysgit to be installed AND git.exe to in the
> system path.
> 
> ps.
> I know there is a TortoiseGit project also, I just didn't know about that
> at the time I started. If I knew about TortoiseGit, I probably never
> started writing my own tool.
> 

-- 
View this message in context: http://n2.nabble.com/Announcement%3A-Git-Extensions-stable-%28windows-shell-extensions%29-tp1669264p1669467.html
Sent from the git mailing list archive at Nabble.com.

^ permalink raw reply

* Re: [PATCH] Documentation: fix description for enabling hooks
From: Markus Heidelberg @ 2008-12-17 19:55 UTC (permalink / raw)
  To: Miklos Vajna; +Cc: Johannes Sixt, gitster, git
In-Reply-To: <20081217143627.GB5691@genesis.frugalware.org>

Miklos Vajna, 17.12.2008:
> On Wed, Dec 17, 2008 at 08:44:40AM +0100, Johannes Sixt <j.sixt@viscovery.net> wrote:
> > > This is true, but having the executable bit is necessary as well. I
> > > think it would be better to just append this requirement instead of
> > > replacing the old one with this.
> > 
> > Markus's proposed new wording is correct because the .sample hooks *are*
> > already executable.
> 
> I thought about the following situation: The user reads the
> documentation while working in an older repo (initialized a few versions
> ago). S/he sees that the .sample suffix is already missing, so s/he
> assumes that the hook is already active. Which is not true, because the
> +x bit is missing.

Valid point, I think, but not critical in this case, since the patch
only affected gitrepository-layout(5) and gitglossary(7).

When you want to learn how to use hooks, you will probably rather read
githooks(5), where the need for the executable bit is not even
explicitly mentioned. Maybe it should be added there?

Markus

^ permalink raw reply

* Is it possible to roll back unstaged changes while leaving the staged ones for the next commit?
From: Tim Visher @ 2008-12-17 19:57 UTC (permalink / raw)
  To: git

Hello Everyone,

I'm attempting to use `git add -i` to help me resolve the differences
between two files in a way that makes sense (leaving class names, some
logic, but changing other things like white space, etc.).  I have all
of the changes that I want to be committed staged now in the index.
My question is this: is it possible to revert the changes that I
haven't staged so that the file looks as if everything inside of it
has been staged because none of the stuff I didn't stage is there
anymore.

Thanks in advance!

-- 

In Christ,

Timmy V.

http://burningones.com/
http://five.sentenc.es/ - Spend less time on e-mail

^ permalink raw reply

* Re: How to maintain private/secret/confidential branch.
From: Łukasz Lew @ 2008-12-17 19:57 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Alexander Potashev, Nick Andrew, git
In-Reply-To: <alpine.LNX.1.00.0812151501570.19665@iabervon.org>

Well, I am still a beginner in git. I just switched from mercurial.
Some inline follows:

2008/12/15 Daniel Barkalow <barkalow@iabervon.org>:
> On Sun, 14 Dec 2008, Łukasz Lew wrote:
>
>> Hi Alexander,
>>
>> On Sun, Dec 14, 2008 at 17:06, Alexander Potashev <aspotashev@gmail.com> wrote:
>> > Hello, Łukasz!
>> >
>> > On 16:38 Sun 14 Dec     , Łukasz Lew wrote:
>> >> Thanks Nick, thats really helpful (and surprisingly simple).
>> >> I have a couple more questions:
>> >>
>> >> On Sun, Dec 14, 2008 at 15:55, Nick Andrew <nick@nick-andrew.net> wrote:
>> >> > On Sun, Dec 14, 2008 at 02:49:50PM +0100, Łukasz Lew wrote:
>> >> >> I don't know how to make such a scenario work:
>> >> >> - two repositories: pub, priv
>> >> >> - priv is clone/branch of pub
>> >> >> - there is some constant developement both in pub and priv
>> >> >> - there are regular syncs with pub in priv
>> >> >>
>> >> >> Problem:
>> >> >> Occasionally I want to push some changes from priv to pub.
>> >> >> Then after syncing with pub I want to get as few conflicts as possible.
>> >> >>
>> >> >> Is it possible to do with git?
>> >> >
>> >> > Git can do almost anything. One should instead ask "How to do this
>> >> > with git?" :-)
>> >>
>> >> So I've heard, but not yet experienced it myself. I'm thrilled to try.
>> >>
>> >> >
>> >> > If I understand your problem, you could solve it with git cherry-pick
>> >> > and rebase. On priv, make a for-public branch from a pub branch. Then
>> >> > cherry-pick the commits you want from your private branch into the
>> >> > for-public branch.
>> >>
>> >> That almost works. Can I somehow split existing commits just like in git-add -p?
>> > It's, however, better to make more commits to not experience the need of
>> > commit splitting.
>>
>> Indeed good advice and best practice, but another best practice is to
>> not commit not compiling state.
>
> In your private branches, it's actually good practice to commit all sorts
> of junk. That way, when you mess up badly while trying to get it to
> compile, you won't have lost your work. Of course, that means your commits
> are going to need more cleanup before going public.

I started to follow your advise.
Then I rebase -i.
I found out I need more precise commit messages. :)

>
>> My common scenario is that I code a big change in priv repository, and
>> after that I find that some of its parts can and should be moved to
>> pub.
>
> I usually end up with my private branch containing the public branch, plus
> a bunch of commits that introduce: bugs, later fixed; mixed improvements;
> and debugging cruft. I want to generate nice commits that are individual
> improvements. I generally do:
> $ git checkout -b submit origin/master (the first time, to set it up)
>
> $ git checkout submit
> $ git diff submit mixed-work
> look at it for good changes, find some in file1 and file2
> $ git diff submit mixed-work -- file1 file2 | git apply

But with this command we do not preserve objects identity.
I.e: when you merge with mixed-work you have duplicate changes.
Is it ok?

> Sometimes, clean up bits that aren't ideal
> $ git add -i
> Add the good parts
> $ git checkout . (revert the working tree to the index)
> $ make test (did I extract the change correctly?)
> $ git commit
> Write a good message, sign off, etc
> $ git checkout mixed-work
> $ git rebase -i submit

... Ah I see, we throw away old commits anyway with rebasing.

> Often, resolve easy conflicts where my mixed-work branch introduced bugs
> that I fixed later and have now adopted the fixed code
>
> Then I repeat until I don't have any more good changes in mixed-work
> (either I have nothing, only debugging cruft, or only stuff I haven't
> gotten to work yet). If there's nothing but cruft, I've fully merged the
> topic, and I delete the branch.
>
> Eventually, I'm satisfied with what I've cleaned up, and I do:
> $ git push origin submit:master
>
> Also, I generally have a bunch of "mixed-work" branches, each containing
> different stuff that isn't ready. I'll periodicly go through all of them
> and rebase onto "submit" or "origin/master" (or, sometimes, give up on
> them and delete them).
>
> (One thing that would be nice to have is a "git apply --interactive" which
> applies the user's choice of hunks, like "git add -i" adds them)

I totally agree.

I would appriciate rebase --copy option, which doesn't move, but copy
the changelists like cherry-pick.
Then we could use rebase -i (with edit) instead of apply.

PS
Why after edit in rebase -i the change is already commited? I always
have to reset;add -i

>
>        -Daniel
> *This .sig left intentionally blank*

^ permalink raw reply

* Re: Announcement: Git Extensions stable (windows shell extensions)
From: Tim Visher @ 2008-12-17 20:04 UTC (permalink / raw)
  To: Henk; +Cc: git
In-Reply-To: <1229540813648-1669264.post@n2.nabble.com>

On Wed, Dec 17, 2008 at 2:06 PM, Henk <henk_westhuis@hotmail.com> wrote:
>
> This is a shameless announcement of my latest personal project; Git
> Extensions. Git Extensions is a Tortoise-like windows shell extension for
> git. Yesterday I finished version 0.9, the first stable release. I included
> about all git commands I know about, so I think it is pretty complete but
> I'm open to suggestions.

Nice!  With all of this TortoiseGit-like activity we're bound to have
something pretty usable pretty quickly! :)  I'll have to check out the
sources soon and see if I can get involved.  Maybe any of the
developers who were planning on working on TortoiseGit could
officially move over to Git Extensions since it seems to be much
further along?

-- 

In Christ,

Timmy V.

http://burningones.com/
http://five.sentenc.es/ - Spend less time on e-mail

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox