Git development
 help / color / mirror / Atom feed
* Re: [git patches] libata updates, GPG signed (but see admin notes)
From: Linus Torvalds @ 2011-11-03  2:19 UTC (permalink / raw)
  To: Shawn Pearce
  Cc: Junio C Hamano, git, James Bottomley, Jeff Garzik, Andrew Morton,
	linux-ide, LKML
In-Reply-To: <CA+55aFwXu=+HdQ5nW11Ts5p-V=KgpxjyagKqB+Xv+qBOEEWXvQ@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1275 bytes --]

On Wed, Nov 2, 2011 at 6:45 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
>   [torvalds@i5 linux]$ git fetch git://github.com/rustyrussell/linux.git  refs/tags/rusty@rustcorp.com.au-v3.1-8068-g5087a50

So this trivial patch removes one line of code, and makes this actually work.

However, it also makes us fail many tests that *test* that we peeled
what we fetched. However, I think the tests are wrong.

If the tag doesn't resolve into a commit, we happily output the SHA1
of the tag itself - and we say that it shouldn't be merged.

And it the tag *does* resolve into a commit, why would we output the
SHA1 of the commit? The tag should be peeled properly later when it
gets used, so peeling it here seems to be just a misfeature that makes
signed tags not work well.

So I suspect we should just apply this patch, but I didn't check
exacty what the failed tests are - except for the first one, that just
compares against a canned response (and the canned response should
just be changed). Maybe there was some reason for the peeling,
although I suspect it was just a fairly mindless case of "make it a
commit, because the merge needs the commit" - never mind that the
merge would peel it anyway.

                           Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 550 bytes --]

 builtin/fetch.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 91731b909aeb..494a7f9976f8 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -436,8 +436,7 @@ static int store_updated_refs(const char *raw_url, const char *remote_name,
 		}
 		note[note_len] = '\0';
 		fprintf(fp, "%s\t%s\t%s",
-			sha1_to_hex(commit ? commit->object.sha1 :
-				    rm->old_sha1),
+			sha1_to_hex(rm->old_sha1),
 			rm->merge ? "" : "not-for-merge",
 			note);
 		for (i = 0; i < url_len; ++i)

^ permalink raw reply related

* Re: [git patches] libata updates, GPG signed (but see admin notes)
From: Linus Torvalds @ 2011-11-03  2:25 UTC (permalink / raw)
  To: Shawn Pearce
  Cc: Junio C Hamano, git, James Bottomley, Jeff Garzik, Andrew Morton,
	linux-ide, LKML
In-Reply-To: <CAJo=hJsXvSyB65KBp8sfciT=h5uZSqSUdxkpWtZJRtr4hXAh5A@mail.gmail.com>

On Wed, Nov 2, 2011 at 7:14 PM, Shawn Pearce <spearce@spearce.org> wrote:
>
> So you propose we put the tag contents into the merge commit message
> so it can be verified after the fact? So merges are now going to be
> something much more horrific to read, because it will end with Git
> object tag cruft, the tag message, and the PGP signature spew that no
> human can decode in the head?

Actually, I wanted to just drop the damn thing.

To me, the point of the tag is so that the person doing the merge can
verify that he merges something trusted.

However, everybody else seems to disagree, and wants that stupid
signature to live along in the repository. And I can live with that,
although I do agree with you that it's not exactly pretty. I can live
with "ugly signature that I don't care for" way more than "stupid
design".

Because unlike your crazy empty commit, it at least fits the workflow,
and it certainly isn't any uglier that extraneous pointless commit.

You can disagree. You obviously do. I simply don't care. Because I'm right.

(And your claim that it's big UI fixes and protocol changes is pure
and utter garbage. I just sent a patch that cleans the code up,
removes a line that improperly drops information and gets rid of the
biggest problem with our current handling of tags. No protocol changes
involved, no big UI fixup).

                        Linus

^ permalink raw reply

* Re: [git patches] libata updates, GPG signed (but see admin notes)
From: Linus Torvalds @ 2011-11-03  2:31 UTC (permalink / raw)
  To: Shawn Pearce
  Cc: Junio C Hamano, git, James Bottomley, Jeff Garzik, Andrew Morton,
	linux-ide, LKML
In-Reply-To: <CAJo=hJsXvSyB65KBp8sfciT=h5uZSqSUdxkpWtZJRtr4hXAh5A@mail.gmail.com>

On Wed, Nov 2, 2011 at 7:14 PM, Shawn Pearce <spearce@spearce.org> wrote:
>
> <rant>

I'm answering this separately, because it's a separate rant.

It's also totally bogus, but whatever.

> Get over it. Add the fucking empty commit to show the flow of a
> change. Stop forcing every fucking contributor to rebase/rewrite his
> commits just so someone higher up in the food chain can wank with
> their SOB line.

Shawn, stop using whatever drugs you are using.

NOBODY EVER REBASES ANYTHING FOR SIGNED-OFF-BY.

If they do, they are doing things very very wrong.

Signed-off-by: is *purely* for sending patches by email. No git
operations involved. None. Nada. Zilch. No rebasing involved, because
there's not even a git repository involved, for chissake!

Once something is in git, it's not signed off on - there should be a
sign-off-chain from the author to the committer, and that's it.
Anything else would be crazy.

So stop the crazy rants. Stop with the bad drugs. Seriously. You're
acting crazy.

                          Linus

^ permalink raw reply

* Re: New Feature wanted: Is it possible to let git clone continue last break point?
From: Jeff King @ 2011-11-03  2:42 UTC (permalink / raw)
  To: Shawn Pearce
  Cc: Junio C Hamano, Jonathan Nieder, netroby, Git Mail List,
	Tomas Carnecky
In-Reply-To: <CAJo=hJtt8vjB5oU+tEabN2AS7c-24bMHNwQSoWtZYtjjrR3d7Q@mail.gmail.com>

On Wed, Nov 02, 2011 at 05:06:53PM -0700, Shawn O. Pearce wrote:

> Yup, I agree. The "repo" tool used by Android does this in Python
> right now[1].  Its a simple hack, if the protocol is HTTP or HTTPS the
> client first tries to download $URL/clone.bundle. My servers have
> rules that trap on */clone.bundle and issue an HTTP 302 Found response
> to direct the client to a CDN. Works. :-)

I thought of doing something like that, but I wanted to be able to make
cross-domain links. The "302 to a CDN" thing is a clever hack, but it
requires more control of the webserver than some users might have. And
of course it doesn't work for the "redirect to git:// on a different
server" trick. Or redirect from "git://".

My thought of having it in "refs/mirrors" is only slightly less hacky,
but I think covers all of those cases. :)

> > Even if the bundle thing ends up too wasteful, it may still be useful to
> > offer a "if you don't have X, go see Y" type of mirror when "Y" is
> > something efficient, like git:// at a faster host (i.e., the "I built 3
> > commits on top of Linus" case).
> 
> Actually, I really think the bundle thing is wasteful. Its a ton of
> additional disk. Hosts like kernel.org want to use sendfile() when
> possible to handle bulk transfers. git:// is not efficient for them
> because we don't have sendfile() capability.

I didn't quite parse this. You say it is wasteful, but then indicate
that it can use sendfile(), which is a good thing.

However, I do agree with this:

> Its also expensive for kernel.org to create each Git repository twice
> on disk. The disk is cheap. Its the kernel buffer cache that is damned
> expensive. Assume for a minute that Linus' kernel repository is a
> popular thing to access. If 400M of that history is available in a
> normal pack file on disk, and again 400M is available as a "clone
> bundle thingy", kernel.org now has to eat 800M of disk buffer cache
> for that one Git repository, because both of those files are going to
> be hot.

Doubling the disk cache required is evil and ugly. I was hoping it
wouldn't matter because the bundle would be hosted on some far-away CDN
server anyway, though. But that is highly dependent on your setup. And
it's really just glossing over the fact that you have twice as many
servers. ;)

> I think I messed up with "repo" using a Git bundle file as its data
> source. What we should have done was a bog standard pack file. Then
> the client can download the pack file into the .git/objects/pack
> directory and just generate the index, reusing the entire dumb
> protocol transport logic. It also allows the server to pass out the
> same file the server retains for the repository itself, and thus makes
> the disk buffer cache only 400M for Linus' repository.

That would be cool, but what about ref tips? The pack is just a big blob
of objects, but we need ref tips to advertise to the server when we come
back via the smart protocol. We can make a guess about them, obviously,
but it would be nice to communicate them. I guess the mirror data could
include the tips and a pointer to a pack file.

Another issue with packs is that they generally aren't supposed to be
--thin on disk, whereas bundles can be. So I could point you to a
succession of bundles. Which is maybe a feature, or maybe just makes
things insanely complex[1].

> One (maybe dumb idea I had) was making the $GIT_DIR/objects/info/packs
> file contain other lines to list reference tips at the time the pack
> was made.

So yeah, that's another solution to the ref tip thingy, and that would
work. I don't think it would make a big difference whether the tips were
in the "mirror" file, or alongside the packfile. The latter I guess
might make administration easier. The "real" repo points its mirror one
time to a static pack store, and then the client goes and grabs whatever
it can from that store.

> Then we wind up with a git:// or ssh:// protocol extension that
> enables sendfile() on an entire pack, and to provide the matching
> objects/info/packs data to help a client over git:// or ssh://
> initialize off the existing pack files.

I think we can get around this by pointing git:// clients, either via
protocol extension or via a magic ref, to an http pack store. Sure, it's
an extra TCP connection, but that's not a big deal compared to doing an
initial clone of most repos.

So the sendfile() stuff would always happen over http.

> But either way, I like the idea of coupling the "resumable pack
> download" to the *existing* pack files, because this is easy to deal
> with.

Yeah, I'm liking that idea. In reference to my [1] above, what I've
started with is making:

  git fetch http://host/foo.bundle

work automatically. And it does work. But it actually spools the bundle
to disk and then unpacks from it, rather than placing it right into the
objects/pack directory. I did this because:

  1. We have to feed it to "index-pack --fix-thin", because bundles can
     be thin. So they're not suitable for sticking right into the pack
     directory.

  2. We could feed it straight to an index-pack pipe, but then we don't
     have a byte-for-byte file on disk to resume an interrupted
     transfer.

But spooling sucks, of course. It means we use twice as much disk space
during the index-pack as we would otherwise need to, not to mention the
latency of not starting the index-pack until we get the whole file.

Pulling down a non-thin packfile makes the problem go away. We can spool
it right into objects/pack, index it on the fly, and if all is well,
move it into its final filename. If the transfer is interrupted, you
drop what's been indexed so far, finish the transfer, and then re-start
the indexing from scratch (actually, the "on the fly" would probably
involve teaching index-pack to be clever about incrementally reading a
partially written file, but it should be possible).

-Peff

^ permalink raw reply

* Re: [git patches] libata updates, GPG signed (but see admin notes)
From: Jeff King @ 2011-11-03  2:55 UTC (permalink / raw)
  To: Shawn Pearce
  Cc: Linus Torvalds, Junio C Hamano, git, James Bottomley, Jeff Garzik,
	Andrew Morton, linux-ide, LKML
In-Reply-To: <CAJo=hJv5nAKH_ptYSWfMvFQv0Dj+naPXK35wSzKYkfPOYsWkxg@mail.gmail.com>

On Wed, Nov 02, 2011 at 06:02:37PM -0700, Shawn O. Pearce wrote:

> > So I really think that signing the top commit itself is fundamentally wrong.
> 
> I really disagree. I like the signed commit approach. It allows for a
> lot more workflows than just providing a way for you to validate a
> pull from a trusted lieutenant. Debian/Gentoo folks want a way to sign
> every commit in their workflow. Just because you don't want that and
> think its crazy doesn't mean its not a valid workflow for that
> community and is something Git shouldn't support. I never use `git
> stash`. I hate the damn command. Yet its still there. I just choose
> not to use it. Junio's gpgsig header on each commit is also optional,
> and communities/contributors can choose to use (or ignore) the feature
> as they need to.

Stop for a minute and think about what it _means_ to sign a commit. Is
it saying "I wrote this commit?" Or "I think this commit is good?" Or "I
think all of the history leading to this is good?" It's obviously going
to be a per-project thing, but it's very constricting.  Leaving aside
all of the workflow issues Linus brought up (but which I do agree with),
think about what it would mean for Linus to fetch a commit from a
lieutenant and then sign it. Whatever it means, it can really only be
_one_ thing.

But big projects that are interested in signatures probably want to say
more. They want to say "this developer really wrote this commit". They
want to say "QA passed this commit". They want to say "the history up to
here looks good". And so on.

But they can't say those things without binding some data to the commit
(i.e., making a certificate saying "this commit passed QA").  Data which
might only make sense to assert much later than the commit is written.

So you're going to need to support detached commit signatures in some
form anyway to make everybody happy. Which isn't to say in-commit
signatures are wrong, but they are just one tool in a toolbox.

Personally, I think the only thing that makes sense to assert inside a
commit itself is that you are the author, and the author line of the key
should match the email UID of the signing key. And then anything you
want to say about _other_ people's commits (or even your own commits,
but later) should come in the form of detached signatures with some
content.

That's how signed tags work. It's not just Linus signing a commit. It's
Linus signing a binding between a commit and the statement "this is
v2.6.28". The only thing wrong with the signed tag model for more
general use is that you need some way of naming and organizing large
numbers of tags (e.g., several per commit if you have things like QA
signatures).

-Peff

^ permalink raw reply

* Re: [git patches] libata updates, GPG signed (but see admin notes)
From: Robin H. Johnson @ 2011-11-03  3:16 UTC (permalink / raw)
  To: Jeff King, Git Mailing List
  Cc: Shawn Pearce, Linus Torvalds, Junio C Hamano, James Bottomley,
	Jeff Garzik, Andrew Morton, linux-ide, LKML
In-Reply-To: <20111103025532.GB9492@sigill.intra.peff.net>

On Wed, Nov 02, 2011 at 10:55:32PM -0400,  Jeff King wrote:
> But big projects that are interested in signatures probably want to say
> more. They want to say "this developer really wrote this commit". They
> want to say "QA passed this commit". They want to say "the history up to
> here looks good". And so on.
On the Gentoo side, we've also pondered the question of:
author != committer != pusher
And how to preserve many signatures from sources.

We're on a central repo model, with some ~250 committers.

I was originally primarily after the push certificates/signed-push, and
recording that data in the notes, but that still has the problems of
third-party verification as mentioned in the thread.

If we require that the tip of every push is a signed commit via a hook,
we get knowledge of the pushers. Either your real commit itself is
signed, or you have a signed merge commit on top, or you have a signed
empty commit. In all of the cases, I can verify your signature at the
recv hook. Having signed push in this case has a benefit that you could
ship the data as a bundle, or async from the signing.

The QA value of multiple signatures per commit is also valuable, to
assert SOB WITHOUT altering the commit. I see spearce's rant and the
retort, and really think there needs to be a middle ground - some of
commits that are coming from pulls, and not getting additional SOB,
could really benefit from them being recorded (I see them on mailing
lists, but not introduced since that would break 'stable' IDs).

> But they can't say those things without binding some data to the commit
> (i.e., making a certificate saying "this commit passed QA").  Data which
> might only make sense to assert much later than the commit is written.
> 
> So you're going to need to support detached commit signatures in some
> form anyway to make everybody happy. Which isn't to say in-commit
> signatures are wrong, but they are just one tool in a toolbox.
I was proposing that Git supports _all_ of these models:
- signed commits
- signed pushes (via certs)
- whatever signed lightweight tag idea happens
- existing annotated tags

Choices. Each with their own costs and advantages.

-- 
Robin Hugh Johnson
Gentoo Linux: Developer, Trustee & Infrastructure Lead
E-Mail     : robbat2@gentoo.org
GnuPG FP   : 11AC BA4F 4778 E3F6 E4ED  F38E B27B 944E 3488 4E85

^ permalink raw reply

* [PATCH] fsck: print progress
From: Nguyễn Thái Ngọc Duy @ 2011-11-03  3:21 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Nguyễn Thái Ngọc Duy
In-Reply-To: <CACsJy8DN74QWYA_NzBCGgp_VdQpV5PqGMgAFUKYbOecVqw6HYQ@mail.gmail.com>

fsck is usually a long process and it would be nice if it prints
progress from time to time.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Documentation/git-fsck.txt |   12 ++++++++-
 builtin/fsck.c             |   57 +++++++++++++++++++++++++++++++++++++++++--
 pack-check.c               |   11 ++++++--
 pack.h                     |    4 ++-
 4 files changed, 76 insertions(+), 8 deletions(-)

diff --git a/Documentation/git-fsck.txt b/Documentation/git-fsck.txt
index a2a508d..5245101 100644
--- a/Documentation/git-fsck.txt
+++ b/Documentation/git-fsck.txt
@@ -10,7 +10,8 @@ SYNOPSIS
 --------
 [verse]
 'git fsck' [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]
-	 [--[no-]full] [--strict] [--verbose] [--lost-found] [<object>*]
+	 [--[no-]full] [--strict] [--verbose] [--lost-found]
+	 [--[no-]progress] [<object>*]
 
 DESCRIPTION
 -----------
@@ -72,6 +73,15 @@ index file, all SHA1 references in .git/refs/*, and all reflogs (unless
 	a blob, the contents are written into the file, rather than
 	its object name.
 
+--progress::
+--no-progress::
+	When fsck is run in a terminal, it will show the progress.
+	These options can force progress to be shown or not
+	regardless terminal check.
++
+Progress is not shown when --verbose is used. --progress is ignored
+in this case.
+
 It tests SHA1 and general object sanity, and it does full tracking of
 the resulting reachability and everything else. It prints out any
 corruption it finds (missing or bad objects), and if you use the
diff --git a/builtin/fsck.c b/builtin/fsck.c
index df1a88b..dc6a6fc 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -11,6 +11,7 @@
 #include "fsck.h"
 #include "parse-options.h"
 #include "dir.h"
+#include "progress.h"
 
 #define REACHABLE 0x0001
 #define SEEN      0x0002
@@ -27,6 +28,8 @@ static const char *head_points_at;
 static int errors_found;
 static int write_lost_and_found;
 static int verbose;
+static int show_progress = -1;
+
 #define ERROR_OBJECT 01
 #define ERROR_REACHABLE 02
 
@@ -512,15 +515,20 @@ static void get_default_heads(void)
 static void fsck_object_dir(const char *path)
 {
 	int i;
+	struct progress *progress;
 
 	if (verbose)
 		fprintf(stderr, "Checking object directory\n");
 
+	if (show_progress)
+		progress = start_progress("Checking object directories", 256);
 	for (i = 0; i < 256; i++) {
 		static char dir[4096];
 		sprintf(dir, "%s/%02x", path, i);
 		fsck_dir(i, dir);
+		display_progress(progress, i+1);
 	}
+	stop_progress(&progress);
 	fsck_sha1_list();
 }
 
@@ -591,6 +599,7 @@ static struct option fsck_opts[] = {
 	OPT_BOOLEAN(0, "strict", &check_strict, "enable more strict checking"),
 	OPT_BOOLEAN(0, "lost-found", &write_lost_and_found,
 				"write dangling objects in .git/lost-found"),
+	OPT_BOOL   (0, "progress", &show_progress, "show progress"),
 	OPT_END(),
 };
 
@@ -603,6 +612,12 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 	read_replace_refs = 0;
 
 	argc = parse_options(argc, argv, prefix, fsck_opts, fsck_usage, 0);
+
+	if (show_progress == -1)
+		show_progress = isatty(2);
+	if (verbose)
+		show_progress = 0;
+
 	if (write_lost_and_found) {
 		check_full = 1;
 		include_reflogs = 0;
@@ -622,20 +637,56 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 
 	if (check_full) {
 		struct packed_git *p;
+		int i, nr_packs = 0;
+		struct progress *progress = NULL;
 
 		prepare_packed_git();
+
 		for (p = packed_git; p; p = p->next)
+			nr_packs++;
+
+		for (i = 1, p = packed_git; p; p = p->next, i++) {
+			if (show_progress) {
+				char buf[32];
+				snprintf(buf, sizeof(buf), "Verifying pack %d/%d",
+					 i, nr_packs);
+				if (open_pack_index(p))
+					continue;
+				progress = start_progress(buf, p->num_objects);
+			}
 			/* verify gives error messages itself */
-			verify_pack(p);
+			verify_pack(p, progress);
+
+			/*
+			 * we do not stop progress here, let the next
+			 * progress line overwrite the current one for
+			 * the next pack.
+			 */
+		}
+		stop_progress(&progress);
 
-		for (p = packed_git; p; p = p->next) {
+		for (i = 1, p = packed_git; p; p = p->next, i++) {
+			char buf[64];
 			uint32_t j, num;
 			if (open_pack_index(p))
 				continue;
 			num = p->num_objects;
-			for (j = 0; j < num; j++)
+
+			snprintf(buf, sizeof(buf), "Checking objects on pack %d/%d",
+				 i, nr_packs);
+			if (show_progress)
+				progress = start_progress(buf, num);
+			for (j = 0; j < num; j++) {
 				fsck_sha1(nth_packed_object_sha1(p, j));
+				display_progress(progress, j+1);
+			}
+			/*
+			 * we do not stop progress here, let the next
+			 * progress line overwrite the current one for
+			 * the next pack.
+			 */
 		}
+		stop_progress(&progress);
 	}
 
 	heads = 0;
diff --git a/pack-check.c b/pack-check.c
index 0c19b6e..e30c18c 100644
--- a/pack-check.c
+++ b/pack-check.c
@@ -1,6 +1,7 @@
 #include "cache.h"
 #include "pack.h"
 #include "pack-revindex.h"
+#include "progress.h"
 
 struct idx_entry {
 	off_t                offset;
@@ -42,7 +43,8 @@ int check_pack_crc(struct packed_git *p, struct pack_window **w_curs,
 }
 
 static int verify_packfile(struct packed_git *p,
-		struct pack_window **w_curs)
+			   struct pack_window **w_curs,
+			   struct progress *progress)
 {
 	off_t index_size = p->index_size;
 	const unsigned char *index_base = p->index_data;
@@ -126,7 +128,10 @@ static int verify_packfile(struct packed_git *p,
 			break;
 		}
 		free(data);
+		if ((i & 1023) == 0)
+			display_progress(progress, i);
 	}
+	display_progress(progress, i);
 	free(entries);
 
 	return err;
@@ -155,7 +160,7 @@ int verify_pack_index(struct packed_git *p)
 	return err;
 }
 
-int verify_pack(struct packed_git *p)
+int verify_pack(struct packed_git *p, struct progress *progress)
 {
 	int err = 0;
 	struct pack_window *w_curs = NULL;
@@ -164,7 +169,7 @@ int verify_pack(struct packed_git *p)
 	if (!p->index_data)
 		return -1;
 
-	err |= verify_packfile(p, &w_curs);
+	err |= verify_packfile(p, &w_curs, progress);
 	unuse_pack(&w_curs);
 
 	return err;
diff --git a/pack.h b/pack.h
index 722a54e..572794f 100644
--- a/pack.h
+++ b/pack.h
@@ -70,10 +70,12 @@ struct pack_idx_entry {
 	off_t offset;
 };
 
+struct progress;
+
 extern const char *write_idx_file(const char *index_name, struct pack_idx_entry **objects, int nr_objects, const struct pack_idx_option *, unsigned char *sha1);
 extern int check_pack_crc(struct packed_git *p, struct pack_window **w_curs, off_t offset, off_t len, unsigned int nr);
 extern int verify_pack_index(struct packed_git *);
-extern int verify_pack(struct packed_git *);
+extern int verify_pack(struct packed_git *, struct progress *);
 extern void fixup_pack_header_footer(int, unsigned char *, const char *, uint32_t, unsigned char *, off_t);
 extern char *index_pack_lockfile(int fd);
 extern int encode_in_pack_object_header(enum object_type, uintmax_t, unsigned char *);
-- 
1.7.3.1.256.g2539c.dirty

^ permalink raw reply related

* Re: [git patches] libata updates, GPG signed (but see admin notes)
From: Jochen Striepe @ 2011-11-03  3:22 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Shawn Pearce, Junio C Hamano, git, James Bottomley, Jeff Garzik,
	Andrew Morton, linux-ide, LKML
In-Reply-To: <CA+55aFyXg32mko8TOGCfGHpr3jHBEgcKiK7HdVwq0Wez0fAs9A@mail.gmail.com>

	Hi,

On Wed, Nov 02, 2011 at 07:25:17PM -0700, Linus Torvalds wrote:
> To me, the point of the tag is so that the person doing the merge can
> verify that he merges something trusted.
> 
> However, everybody else seems to disagree, and wants that stupid
> signature to live along in the repository.

It seems quite useless and leading to false conclusions in several cases
where the merger's gpg output differs from someone's checking later on,
e.g. when

 - the signing key has been revoked in the mean time (for whatever
   reasons)
 - the signing key has expired
 - the public part of the signing key is not available for the general
   public.

AFAIK gpg just gives you an error code and a message like e.g. "Key has
expired" without stating if the key was valid _when signing the commit_.

How do you plan to handle this when keeping the signature in the
repository? Or am I overlooking something?


Thanks,
Jochen.

^ permalink raw reply

* Re: [PATCH] fsck: print progress
From: Jeff King @ 2011-11-03  3:33 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git
In-Reply-To: <1320290513-8878-1-git-send-email-pclouds@gmail.com>

On Thu, Nov 03, 2011 at 10:21:53AM +0700, Nguyen Thai Ngoc Duy wrote:

> fsck is usually a long process and it would be nice if it prints
> progress from time to time.

The output looks good to me. Code looks sane overall, with one comment:

> +		for (i = 1, p = packed_git; p; p = p->next, i++) {
> +			if (show_progress) {
> +				char buf[32];
> +				snprintf(buf, sizeof(buf), "Verifying pack %d/%d",
> +					 i, nr_packs);
> +				if (open_pack_index(p))
> +					continue;
> +				progress = start_progress(buf, p->num_objects);
> +			}
>  			/* verify gives error messages itself */
> -			verify_pack(p);
> +			verify_pack(p, progress);
> +
> +			/*
> +			 * we do not stop progress here, let the next
> +			 * progress line overwrite the current one for
> +			 * the next pack.
> +			 */
> +		}
> +		stop_progress(&progress);

We're actually leaking some memory here, since stop_progress will also
free() the progress object and any associated resources. It's not a lot,
but it's kind of ugly.

Perhaps there should be a special version of stop_progress that handles
this better? Or perhaps we could even come up with a total object count
before starting.  I guess it would involve mapping each pack index
simultaneously, though by my reading of the code, I think we do that
anyway (the opened index is cached in the pack object).

-Peff

^ permalink raw reply

* Re: [git patches] libata updates, GPG signed (but see admin notes)
From: Linus Torvalds @ 2011-11-03  4:13 UTC (permalink / raw)
  To: Jochen Striepe
  Cc: Shawn Pearce, Junio C Hamano, git, James Bottomley, Jeff Garzik,
	Andrew Morton, linux-ide, LKML
In-Reply-To: <20111103032205.GA25888@pompeji.miese-zwerge.org>

On Wed, Nov 2, 2011 at 8:22 PM, Jochen Striepe <jochen@tolot.escape.de> wrote:
>
> It seems quite useless and leading to false conclusions in several cases
> where the merger's gpg output differs from someone's checking later on,
> e.g. when
>
>  - the signing key has been revoked in the mean time (for whatever
>   reasons)
>  - the signing key has expired
>  - the public part of the signing key is not available for the general
>   public.

So I don't think those are *big* issues. Sure, you'd want the public
key to be public for it to make any real sense to save, but on the
other hand, they *are* generally public. Yes, yes, you might have keys
that are only used - and only made public - within some particular
organization, but in that case the source code that gets signed with
those keys would tend to be private to that organization too, so..

And yes, keys get revoked or they expire, but that's still a pretty
rare event, so it doesn't really invalidate the argument that making
the original signed content available can quite often be useful - even
if it's not guaranteed to *always* be useful.

No, my main objection to saving the data is that it's ugly and it's
redundant. Sure, in practice you can check the signatures later fine
(with the rare exceptions you mention), but even when you can do it,
what's the big upside?

And there are much bigger real downsides, imho.

For example, let's say that we do eventually end up switching from
SHA1 to SHA256 in git, and we do a full re-import of the tree. Guess
what? All those signatures are now just so much garbage. Sure, you can
recreate them (create some trusted script that you agree does a 1:1
transform, and re-sign everything), but in practice you can't ever
really do that - because all those things are tied to the tree, so you
need to have *everybodys* private keys in one place to do so. And the
people who signed things initially would have to be insane to allow
that.

So I'm actually of the opinion that "internal signatures" are bad
design at a rather fundamental level.

In contrast, the "external signed tags" are fine: it's not just that
there are much fewer of them, it's that they are *independent*. So you
can easily re-generate the signed tags, because each signer can
*individually* decide to validate the newly converted tree, and sign
off on the fact that the conversion was done identically using new
external tags with signatures.

This was one of the reasons I made the signed tags work the way they
do. And it wasn't because I was extremely far-sighted and thought of
all the problems that internal signatures have - it's because monotone
had their internal signatures, and every other email on the monotone
list was about all the problems it caused.

> AFAIK gpg just gives you an error code and a message like e.g. "Key has
> expired" without stating if the key was valid _when signing the commit_.
>
> How do you plan to handle this when keeping the signature in the
> repository? Or am I overlooking something?

So see above - I just wouldn't worry about it. The possible few cases
where it would occur are dwarfed by the cases where it *doesn't*
occur, and those are the ones I'd concentrate on. They are the ones
that need to be important enough that it's even worth carrying the
random noise around.

Are they?

So I do think that there are real upsides at the *process* level where
you can use the signatures to verify that what is pulled is pulled
from the person you thought it was. I don't think anybody disputes
those advantages. But outside of that I think it gets very gray, and
there real disadvantages.

That said, I don't care *that* much. I don't mind polluting the merge
commits with information that I don't think is really worth it. So I'd
be willing to carry the signature information around, although I'd
hope to minimize it and have some sane way to hide it.

            Linus

^ permalink raw reply

* Re: New Feature wanted: Is it possible to let git clone continue last break point?
From: Shawn Pearce @ 2011-11-03  4:19 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, Jonathan Nieder, netroby, Git Mail List,
	Tomas Carnecky
In-Reply-To: <20111103024248.GA9492@sigill.intra.peff.net>

On Wed, Nov 2, 2011 at 19:42, Jeff King <peff@peff.net> wrote:
> On Wed, Nov 02, 2011 at 05:06:53PM -0700, Shawn O. Pearce wrote:
>
> I thought of doing something like that, but I wanted to be able to make
> cross-domain links. The "302 to a CDN" thing is a clever hack, but it
> requires more control of the webserver than some users might have. And
> of course it doesn't work for the "redirect to git:// on a different
> server" trick. Or redirect from "git://".

I agree. Later I said I regret this being a bundle file. I also regret
it being this $URL/clone.bundle thing. Its a reasonable quick hack in
Python for repo. Its cheap for my servers to respond 404 Not Found or
302 Found, and cheap to use the CDN. But it isn't the right solution
for git-core.

It has given us some useful information already in the context of
android.googlesource.com. It appears to work quite well for
distributing the large Android operating system. So the notion of
making packs available from another URL than the main repository, and
doing it as primarily a pack and not the native Git protocol, with a
follow-up incremental fetch to bring the client current seems to work.
 :-)

> My thought of having it in "refs/mirrors" is only slightly less hacky,
> but I think covers all of those cases. :)

Right, but this would have been a bit more work for me to code in Python. :-)

Long term this may be a better approach, because it does allow the
user to control the redirect without having full control over their
HTTP server. It also supports redirections across protocols like you
noted above. So its probably the direction we will see git-core take.

>> Actually, I really think the bundle thing is wasteful.... sendfile() capability.
>
> I didn't quite parse this. You say it is wasteful, but then indicate
> that it can use sendfile(), which is a good thing.

Apparently I was babbling. Based on what else you say, we agree. That
is good enough for me.

> However, I do agree with this:
>
>> Its also expensive for kernel.org to create each Git repository twice
>> on disk. The disk is cheap. Its the kernel buffer cache that is damned
>> expensive.
>
> Doubling the disk cache required is evil and ugly. I was hoping it
> wouldn't matter because the bundle would be hosted on some far-away CDN
> server anyway, though. But that is highly dependent on your setup. And
> it's really just glossing over the fact that you have twice as many
> servers. ;)

Right. :-)

In my opinion this is the important part. We shouldn't double the disk
usage required to support this. Most users can't afford the extra disk
cache or the extra server required to make this work well. But they
can use sendfile() on the server they have and get a lot of
improvement in clone speed due to lower system load, plus resumable
clone for the relatively stable history part.

> Another issue with packs is that they generally aren't supposed to be
> --thin on disk, whereas bundles can be. So I could point you to a
> succession of bundles. Which is maybe a feature, or maybe just makes
> things insanely complex[1].

Actually we can store --thin on disk safely.  Don't laugh until you
finish reading it through.

To build an incremental pack we modify pack-objects to construct a
completed thin pack on disk. Build up the list of objects that you
want in the thin pack, as though it were thin. Use REF_DELTA format to
reference objects that are not in this set but are delta bases. Copy
the necessary delta bases from the base pack over to the thin pack, at
the end just like it would be if received over the wire. The pack is
now self-contained like its supposed to be, but the tail of it is
redundant information.

If you cache alongside of the pack the "thin" object count, the cut
offset of the thin vs. completed bases, and the SHA-1 of the "thin"
pack, you can serve the "thin" pack by copying the header, then the
region of the file up to the cut point, and the final SHA-1. And there
are no pack file format changes involved.

:-)

Obviously this has some downside. Using REF_DELTA instead of OFS_DELTA
for the relatively small number of references from the "thin" part to
the completed part at the tail isn't a big disk space overhead. The
big overhead is storing the boundary data that served as delta bases
at the tail of this incremental pack. But we already do that when you
transfer this section of data over the network and it was more than
100 objects.

So I think we can get away with doing this. The serving repository is
in no worse state than if the owner had just pushed all of that
incremental stuff into the serving repository and it completed as a
thin pack. With only 2 packs in the serving repository (e.g. the
historical stuff that is stable, and the incremental current thin pack
+ completed bases), git gc --auto wouldn't even kick in to GC this
thing for a while *anyway*. So we already probably have a ton of
repositories in the wild that exhibit this disk layout and space
usage, and nobody has complained about it.

For a server admin or repository owner who cares about his user's
resumable clone support, carrying around a historical pack and a
single new incremental pack for say 2-3 months before repacking the
entire thing down to 1 new historical pack... the disk space and
additional completed base data is an acceptable cost. We already do
it.

Clients can figure out whether or not they should use an incremental
pack download vs the native Git protocol if the incremental pack does
like a bundle does and stores the base information alongside of it.
Actually you don't want the base (the ^ lines in a bundle), but the
immediate child of those. If the client has any of those children,
there is some chance the client has other objects in the pack and
should favor native protocol. But if the client has none of those base
children, but does have the base, it may be more efficient to download
the entire pack to bring the client current.

The problem with incremental pack updates is balancing the number of
round-trip requests against the update rate of the repository against
the polling frequency of the client. Its not an easy thing to solve.

However, we may be able to do better if the server can do a reasonably
fast concat of these thin pack slices together by writing a new object
header and computing the SHA-1 trailer as it goes. Instead of
computing actual graph connectivity, just concat packs together
between the base children and the requested tips. This probably
requires that the client ask for every branch (e.g. the typical
refs/heads/*:refs/remotes/origin/* refspec) and that branches didn't
rewind. But I think this is so common its perhaps worthwhile to look
into optimizing. But note we can do this in the native protocol at the
server side without telling the client anything, or changing the
protocol. It just isn't resumable without a bit more glue to have a
state marker available to the client. Nor does it work on a CDN
without giving the client more information. :-)

> So the sendfile() stuff would always happen over http.

I'm OK with that. I was just saying we may be able to also support
sendfile() over git:// if the repository owner / git-daemon owner
wants us to. Or if not sendfile(), a simple read-write loop that
doesn't have to look at the data, since the client will validate it
all.

> Yeah, I'm liking that idea. In reference to my [1] above, what I've
> started with is making:
>
>  git fetch http://host/foo.bundle

This should work, whether or not we use it for resumable clone. Its
just nice to have that tiny bit of extra glue to make it easy to pull
a bundle. So I'd like this too. :-)

> Pulling down a non-thin packfile makes the problem go away. We can spool
> it right into objects/pack, index it on the fly, and if all is well,
> move it into its final filename. If the transfer is interrupted, you
> drop what's been indexed so far, finish the transfer, and then re-start
> the indexing from scratch (actually, the "on the fly" would probably
> involve teaching index-pack to be clever about incrementally reading a
> partially written file, but it should be possible).

I wonder if we can teach index-pack to work with a thin pack on disk
and complete that by appending to the file, in addition to the
streaming from stdin it supports. Seems like that should be possible.
So then you could save a thin pack to a temp file on disk, and thus
could split a bundle header from its pack content, saving them into
two different temp files, allowing index-pack to avoid copying the
pack portion if its non-thin, or if its a huge thin pack.

I did think about doing this in "repo" and decided it was complex, and
not worth the effort. So we spool. 2G+ bundles. Its not the most
pleasant user experience. If I had more time, I would have tried to
split the bundle header from the pack and written the pack directly
off for index-pack to read from disk.

^ permalink raw reply

* Folder level Acces in git
From: redhat1981 @ 2011-11-03  6:10 UTC (permalink / raw)
  To: git

Hi,

I am using Gitosis, Adding the gitosis conf file

[group testabc]
writable = testabc
members =   shrii Abhijeet premkumar
add cgit = yes
gitweb = yes


[group testabc-readonly]
readonly = testabc
members =  Ganesh Shweta
add cgit = yes
gitweb = yes

Inside the repository, testabc let us say there are folders folder1, folder
2 etc, I want some users to have read/write, read or no access to the
folder1 or folder2, Is this possible in Git, I have done it in SVN, Please
help!!

redhat1981@gmail.com


--
View this message in context: http://git.661346.n2.nabble.com/Folder-level-Acces-in-git-tp6958047p6958047.html
Sent from the git mailing list archive at Nabble.com.

^ permalink raw reply

* Re: [PATCH] Add options to specify snapshot file name, prefix
From: Thomas Guyot-Sionnest @ 2011-11-03  6:28 UTC (permalink / raw)
  To: Thomas Guyot-Sionnest; +Cc: git
In-Reply-To: <1320298641-11149-1-git-send-email-dermoth@aei.ca>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Please disregard; i've actually managed to break the patch while playing
with it.

I will resubmit.

- --

Thomas

On 11-11-03 01:37 AM, Thomas Guyot-Sionnest wrote:
> commit b629275 implemented "smart" snapshot names and prefixes. I have
> scripts that used to rely on the old behaviour which allowed in some
> cases to have fixed prefix, and would require modifications to work with
> newer Gitweb.
> 
> This patch adds two parameters for overriding the snapshot name and
> prefix, sn and sp respectively. For example, to get a snapshot
> named "myproject.[suffix]" with no prefix one can add this query string:
>   ?sn=myproject;sp=
> 
> Signed-off-by: Thomas Guyot-Sionnest <dermoth@aei.ca>
> ---
>  gitweb/gitweb.perl |    8 ++++++++
>  1 files changed, 8 insertions(+), 0 deletions(-)
> 
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index 4f0c3bd..4820ef7 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -756,6 +756,8 @@ our @cgi_param_mapping = (
>  	searchtext => "s",
>  	searchtype => "st",
>  	snapshot_format => "sf",
> +	snapshot_name => "sn",
> +	snapshot_prefix => "sp",
>  	extra_options => "opt",
>  	search_use_regexp => "sr",
>  	ctag => "by_tag",
> @@ -6684,6 +6686,12 @@ sub git_snapshot {
>  	}
>  
>  	my ($name, $prefix) = snapshot_name($project, $hash);
> +	if (defined($input_params{'snapshot_name'})) {
> +		$name = $input_params{'snapshot_name'};
> +	}
> +	if (defined($input_params{'snapshot_prefix'})) {
> +		$prefix = $input_params{'snapshot_prefix'};
> +	}
>  	my $filename = "$name$known_snapshot_formats{$format}{'suffix'}";
>  	my $cmd = quote_command(
>  		git_cmd(), 'archive',


- -- 
Thomas
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk6yNJcACgkQ6dZ+Kt5BchZkCgCgvJ9kAF636HYCAwQ7LoWu8ckG
8+8Anj5ocii3c7vKQ2/eVjcAvDJxsvhQ
=jGwr
-----END PGP SIGNATURE-----

^ permalink raw reply

* Re: Folder level Acces in git
From: Magnus Bäck @ 2011-11-03  7:17 UTC (permalink / raw)
  To: redhat1981; +Cc: git
In-Reply-To: <1320300655224-6958047.post@n2.nabble.com>

On Thursday, November 03, 2011 at 07:10 CET,
     redhat1981 <redhat1981@gmail.com> wrote:

> I am using Gitosis, Adding the gitosis conf file
> 
> [group testabc]
> writable = testabc
> members =   shrii Abhijeet premkumar
> add cgit = yes
> gitweb = yes
> 
> 
> [group testabc-readonly]
> readonly = testabc
> members =  Ganesh Shweta
> add cgit = yes
> gitweb = yes
> 
> Inside the repository, testabc let us say there are folders folder1,
> folder 2 etc, I want some users to have read/write, read or no access
> to the folder1 or folder2, Is this possible in Git, I have done it in
> SVN, Please help!!

Given Git's nature, you can't have read access restrictions on a sub-git
level (i.e. file/directory level). For basically the same reason, you
can never prevent users from making (local) commits that modify certain
paths (but you can encourage people to have local hooks to enforce such
policies). What you *can* do is install a server-side update hook that
rejects attempts to push commits that modify certain paths. If you're
willing to trade Gitosis for Gitolite, you get that feature for free.

http://book.git-scm.com/5_git_hooks.html
http://progit.org/book/ch4-8.html

-- 
Magnus Bäck                   Opinions are my own and do not necessarily
SW Configuration Manager      represent the ones of my employer, etc.
Sony Ericsson

^ permalink raw reply

* [PATCH] fsck: print progress
From: Nguyễn Thái Ngọc Duy @ 2011-11-03  8:50 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Nguyễn Thái Ngọc Duy
In-Reply-To: <20111103033325.GA10230@sigill.intra.peff.net>

fsck is usually a long process and it would be nice if it prints
progress from time to time.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 2011/11/3 Jeff King <peff@peff.net>:
 > We're actually leaking some memory here, since stop_progress will also
 > free() the progress object and any associated resources. It's not a lot,
 > but it's kind of ugly.

 Fixed by always calling stop_progress_msg() but make sure no newline
 is printed (actually "done\n" is not)

 Also fixed "progress" initialization in fsck_object_dir()

 > Or perhaps we could even come up with a total object count
 > before starting.  I guess it would involve mapping each pack index
 > simultaneously, though by my reading of the code, I think we do that
 > anyway (the opened index is cached in the pack object).

 I think this way is better because we can count two numbers at a
 time, nr. packs and nr. objects of current pack. Total object
 (I assume you mean of all packs) may be less informative.

 Documentation/git-fsck.txt |   12 +++++++++-
 builtin/fsck.c             |   52 +++++++++++++++++++++++++++++++++++++++++--
 pack-check.c               |   11 ++++++--
 pack.h                     |    4 ++-
 progress.c                 |    2 +-
 5 files changed, 72 insertions(+), 9 deletions(-)

diff --git a/Documentation/git-fsck.txt b/Documentation/git-fsck.txt
index a2a508d..5245101 100644
--- a/Documentation/git-fsck.txt
+++ b/Documentation/git-fsck.txt
@@ -10,7 +10,8 @@ SYNOPSIS
 --------
 [verse]
 'git fsck' [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]
-	 [--[no-]full] [--strict] [--verbose] [--lost-found] [<object>*]
+	 [--[no-]full] [--strict] [--verbose] [--lost-found]
+	 [--[no-]progress] [<object>*]
 
 DESCRIPTION
 -----------
@@ -72,6 +73,15 @@ index file, all SHA1 references in .git/refs/*, and all reflogs (unless
 	a blob, the contents are written into the file, rather than
 	its object name.
 
+--progress::
+--no-progress::
+	When fsck is run in a terminal, it will show the progress.
+	These options can force progress to be shown or not
+	regardless terminal check.
++
+Progress is not shown when --verbose is used. --progress is ignored
+in this case.
+
 It tests SHA1 and general object sanity, and it does full tracking of
 the resulting reachability and everything else. It prints out any
 corruption it finds (missing or bad objects), and if you use the
diff --git a/builtin/fsck.c b/builtin/fsck.c
index df1a88b..e19b78c 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -11,6 +11,7 @@
 #include "fsck.h"
 #include "parse-options.h"
 #include "dir.h"
+#include "progress.h"
 
 #define REACHABLE 0x0001
 #define SEEN      0x0002
@@ -27,6 +28,8 @@ static const char *head_points_at;
 static int errors_found;
 static int write_lost_and_found;
 static int verbose;
+static int show_progress = -1;
+
 #define ERROR_OBJECT 01
 #define ERROR_REACHABLE 02
 
@@ -512,15 +515,20 @@ static void get_default_heads(void)
 static void fsck_object_dir(const char *path)
 {
 	int i;
+	struct progress *progress = NULL;
 
 	if (verbose)
 		fprintf(stderr, "Checking object directory\n");
 
+	if (show_progress)
+		progress = start_progress("Checking object directories", 256);
 	for (i = 0; i < 256; i++) {
 		static char dir[4096];
 		sprintf(dir, "%s/%02x", path, i);
 		fsck_dir(i, dir);
+		display_progress(progress, i+1);
 	}
+	stop_progress(&progress);
 	fsck_sha1_list();
 }
 
@@ -591,6 +599,7 @@ static struct option fsck_opts[] = {
 	OPT_BOOLEAN(0, "strict", &check_strict, "enable more strict checking"),
 	OPT_BOOLEAN(0, "lost-found", &write_lost_and_found,
 				"write dangling objects in .git/lost-found"),
+	OPT_BOOL   (0, "progress", &show_progress, "show progress"),
 	OPT_END(),
 };
 
@@ -603,6 +612,12 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 	read_replace_refs = 0;
 
 	argc = parse_options(argc, argv, prefix, fsck_opts, fsck_usage, 0);
+
+	if (show_progress == -1)
+		show_progress = isatty(2);
+	if (verbose)
+		show_progress = 0;
+
 	if (write_lost_and_found) {
 		check_full = 1;
 		include_reflogs = 0;
@@ -622,20 +637,51 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 
 	if (check_full) {
 		struct packed_git *p;
+		int i, nr_packs = 0;
+		struct progress *progress = NULL;
 
 		prepare_packed_git();
+
 		for (p = packed_git; p; p = p->next)
+			nr_packs++;
+
+		for (i = 1, p = packed_git; p; p = p->next, i++) {
+			if (show_progress) {
+				char buf[32];
+				snprintf(buf, sizeof(buf), "Verifying pack %d/%d",
+					 i, nr_packs);
+				if (open_pack_index(p))
+					continue;
+				progress = start_progress(buf, p->num_objects);
+			}
 			/* verify gives error messages itself */
-			verify_pack(p);
+			verify_pack(p, progress);
 
-		for (p = packed_git; p; p = p->next) {
+			if (p->next)
+				stop_progress_msg(&progress, NULL);
+		}
+		stop_progress(&progress);
+
+		for (i = 1, p = packed_git; p; p = p->next, i++) {
+			char buf[64];
 			uint32_t j, num;
 			if (open_pack_index(p))
 				continue;
 			num = p->num_objects;
-			for (j = 0; j < num; j++)
+
+			snprintf(buf, sizeof(buf), "Checking objects on pack %d/%d",
+				 i, nr_packs);
+			if (show_progress)
+				progress = start_progress(buf, num);
+			for (j = 0; j < num; j++) {
 				fsck_sha1(nth_packed_object_sha1(p, j));
+				display_progress(progress, j+1);
+			}
+
+			if (p->next)
+				stop_progress_msg(&progress, NULL);
 		}
+		stop_progress(&progress);
 	}
 
 	heads = 0;
diff --git a/pack-check.c b/pack-check.c
index 0c19b6e..e30c18c 100644
--- a/pack-check.c
+++ b/pack-check.c
@@ -1,6 +1,7 @@
 #include "cache.h"
 #include "pack.h"
 #include "pack-revindex.h"
+#include "progress.h"
 
 struct idx_entry {
 	off_t                offset;
@@ -42,7 +43,8 @@ int check_pack_crc(struct packed_git *p, struct pack_window **w_curs,
 }
 
 static int verify_packfile(struct packed_git *p,
-		struct pack_window **w_curs)
+			   struct pack_window **w_curs,
+			   struct progress *progress)
 {
 	off_t index_size = p->index_size;
 	const unsigned char *index_base = p->index_data;
@@ -126,7 +128,10 @@ static int verify_packfile(struct packed_git *p,
 			break;
 		}
 		free(data);
+		if ((i & 1023) == 0)
+			display_progress(progress, i);
 	}
+	display_progress(progress, i);
 	free(entries);
 
 	return err;
@@ -155,7 +160,7 @@ int verify_pack_index(struct packed_git *p)
 	return err;
 }
 
-int verify_pack(struct packed_git *p)
+int verify_pack(struct packed_git *p, struct progress *progress)
 {
 	int err = 0;
 	struct pack_window *w_curs = NULL;
@@ -164,7 +169,7 @@ int verify_pack(struct packed_git *p)
 	if (!p->index_data)
 		return -1;
 
-	err |= verify_packfile(p, &w_curs);
+	err |= verify_packfile(p, &w_curs, progress);
 	unuse_pack(&w_curs);
 
 	return err;
diff --git a/pack.h b/pack.h
index 722a54e..572794f 100644
--- a/pack.h
+++ b/pack.h
@@ -70,10 +70,12 @@ struct pack_idx_entry {
 	off_t offset;
 };
 
+struct progress;
+
 extern const char *write_idx_file(const char *index_name, struct pack_idx_entry **objects, int nr_objects, const struct pack_idx_option *, unsigned char *sha1);
 extern int check_pack_crc(struct packed_git *p, struct pack_window **w_curs, off_t offset, off_t len, unsigned int nr);
 extern int verify_pack_index(struct packed_git *);
-extern int verify_pack(struct packed_git *);
+extern int verify_pack(struct packed_git *, struct progress *);
 extern void fixup_pack_header_footer(int, unsigned char *, const char *, uint32_t, unsigned char *, off_t);
 extern char *index_pack_lockfile(int fd);
 extern int encode_in_pack_object_header(enum object_type, uintmax_t, unsigned char *);
diff --git a/progress.c b/progress.c
index 3971f49..fc96a5f 100644
--- a/progress.c
+++ b/progress.c
@@ -245,7 +245,7 @@ void stop_progress_msg(struct progress **p_progress, const char *msg)
 	if (!progress)
 		return;
 	*p_progress = NULL;
-	if (progress->last_value != -1) {
+	if (progress->last_value != -1 && msg) {
 		/* Force the last update */
 		char buf[128], *bufp;
 		size_t len = strlen(msg) + 5;
-- 
1.7.3.1.256.g2539c.dirty

^ permalink raw reply related

* How to find a commit that introduces (not removes) a string?
From: Sebastian Schuberth @ 2011-11-03  9:50 UTC (permalink / raw)
  To: git

Hi all,

I know about git log's -S / -G, but I'm unable to make these search through *introduced* strings only. Is there a way to do so?

Thanks!

PS: I also read [1], but although the author claims to be interested in introduced strings only, he seems to be satisfied with -G, which slightly puzzles me.

[1] http://stackoverflow.com/questions/5816134/git-finding-a-commit-that-introduced-a-string

-- 
Sebastian Schuberth

^ permalink raw reply

* [PATCH] log: support --full-diff=<pathspec>
From: Nguyễn Thái Ngọc Duy @ 2011-11-03 10:01 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

I wanted to check if any patches that update builtin/fsck.c also
update 't' directory and git seemed not support this case
(true?). With this I can do

  git log --patch --full-diff="'builtin/fsck.c' 't'" -- builtin/fsck.c

I guess this may be something people find useful.

This patch is a bit inconvenient though because <pathspec> is parsed
with sq_dequote_to_argv() and all arguments must be wrapped by ''.
Also "full-diff" name does not make much sense when it comes with
pathspec.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 revision.c |   19 +++++++++++++++++++
 revision.h |    1 +
 2 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/revision.c b/revision.c
index 8764dde..f508953 100644
--- a/revision.c
+++ b/revision.c
@@ -13,6 +13,7 @@
 #include "decorate.h"
 #include "log-tree.h"
 #include "string-list.h"
+#include "quote.h"
 
 volatile show_early_output_fn_t show_early_output;
 
@@ -1531,6 +1532,10 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	} else if (!strcmp(arg, "--full-diff")) {
 		revs->diff = 1;
 		revs->full_diff = 1;
+	} else if (!prefixcmp(arg, "--full-diff=")) {
+		revs->diff = 1;
+		revs->full_diff = 1;
+		revs->full_diff_opt = arg + strlen("--full-diff=");
 	} else if (!strcmp(arg, "--full-history")) {
 		revs->simplify_history = 0;
 	} else if (!strcmp(arg, "--relative-date")) {
@@ -1819,6 +1824,20 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
 			revs->prune = 1;
 		if (!revs->full_diff)
 			diff_tree_setup_paths(revs->prune_data.raw, &revs->diffopt);
+		else if (revs->full_diff_opt) {
+			const char **argv = NULL;
+			int alloc = 0, nr = 0;
+			char *arg;
+
+			arg = xstrdup(revs->full_diff_opt);
+			sq_dequote_to_argv(arg, &argv, &nr, &alloc);
+
+			ALLOC_GROW(argv, nr + 1, alloc);
+			argv[nr] = NULL;
+			argv = get_pathspec(revs->prefix, argv);
+
+			diff_tree_setup_paths(argv, &revs->diffopt);
+		}
 	}
 	if (revs->combine_merges)
 		revs->ignore_merges = 0;
diff --git a/revision.h b/revision.h
index 6aa53d1..baa709c 100644
--- a/revision.h
+++ b/revision.h
@@ -137,6 +137,7 @@ struct rev_info {
 	const char	*subject_prefix;
 	int		no_inline;
 	int		show_log_size;
+	const char      *full_diff_opt;
 
 	/* Filter by commit log message */
 	struct grep_opt	grep_filter;
-- 
1.7.3.1.256.g2539c.dirty

^ permalink raw reply related

* Re: [PATCH v3 2/7] invalidate_ref_cache(): take the submodule as parameter
From: Michael Haggerty @ 2011-11-03 10:23 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Jeff King, Drew Northup, Jakub Narebski, Heiko Voigt,
	Johan Herland, Julian Phillips
In-Reply-To: <7vmxczmrb0.fsf@alter.siamese.dyndns.org>

On 10/17/2011 08:00 PM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
>> On 10/12/2011 09:19 PM, Junio C Hamano wrote:
>>> Michael Haggerty <mhagger@alum.mit.edu> writes:
>>>  ...
>>> Probably that is what all the existing callers want, but I would have
>>> expected that an existing feature would be kept, perhaps like this
>>> instead:
>>>
>>> 	if (!submodule) {
>>> 		struct ref_cache *c;
>>>                 for (c = ref_cache; c; c = c->next)
>>>                 	clear_ref_cache(c);
>>> 	} else {
>>> 		clear_ref_cache(get_ref_cache(submodule);
>>> 	}
>> ...
>> Your specific suggestion would not work because currently
>> submodule==NULL signifies the main module.  However, it would be easy to
>> add the few-line function when/if it is needed.
> 
> I think "submodule==NULL" is probably a mistake; "" would make more sense
> given that you are storing the string in name[FLEX_ARRAY] field.

Sorry I didn't respond to this earlier.

The public API convention (which predates my changes) is that "char
*submodule" arguments either point at the relative path to the submodule
or are NULL to denote the main module.  But since these are stored
internally in a name[FLEX_ARRAY] field, I have been using "" internally
to denote the main module.  I believe that everything is done correctly,
but I admit that the use of different conventions internally and
externally is a potential source of programming errors.

If this is viewed as something that needs changing, the easiest thing
would probably be to add a "const char *submodule" in the ref_cache data
structure that either contains NULL or points at the name field, and to
consistently use the convention that the main module must always be
denoted by NULL.  The space overhead would be negligible because the
number of ref_cache objects is limited to the number of submodules plus 1.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply

* Re: git-p4: problem with commit 97a21ca50ef8
From: Vitor Antunes @ 2011-11-03 11:04 UTC (permalink / raw)
  To: Michael Wookey; +Cc: Pete Wyckoff, Git Mailing List, Luke Diamand
In-Reply-To: <CAOk9v+_xaS_Y1m17TROOSjgiscT+QEJWbpZbAZFmh8_tAviF6Q@mail.gmail.com>

Hi Michael,

On Wed, Nov 2, 2011 at 10:42 PM, Michael Wookey <michaelwookey@gmail.com> wrote:
> I tried your suggested version of git-p4 (at rev 630fb678c46c) and
> unfortunately, the perforce repository fails to import. Firstly, there
> was a problem with importing UTF-16 encoded files, secondly the
> "apple" filetype files are still skipped.

I had no intention of directing you to try that version. Sorry for
misleading you on this.

I just found it interesting that P4's KB contains an article that
directs users to another version which isn't this one.

-- 
Vitor Antunes

^ permalink raw reply

* Re: [PATCH 1/2] git-gui: make config gui.warndetachedcommit a boolean
From: Bert Wesarg @ 2011-11-03 11:27 UTC (permalink / raw)
  To: Pat Thoyts; +Cc: Heiko Voigt, git, Bert Wesarg
In-Reply-To: <0f4995b5df707782c73ec83171fb2f512ae887ef.1319312280.git.bert.wesarg@googlemail.com>

On Sat, Oct 22, 2011 at 21:39, Bert Wesarg <bert.wesarg@googlemail.com> wrote:
> Signed-off-by: Bert Wesarg <bert.wesarg@googlemail.com>
> ---
>  lib/commit.tcl |    2 +-
>  lib/option.tcl |    1 +
>  2 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/lib/commit.tcl b/lib/commit.tcl
> index 372bed9..e27e148 100644
> --- a/lib/commit.tcl
> +++ b/lib/commit.tcl
> @@ -263,7 +263,7 @@ proc commit_commitmsg {curHEAD msg_p} {
>        global is_detached repo_config
>        global pch_error
>
> -       if {$is_detached && $repo_config(gui.warndetachedcommit)} {
> +       if {$is_detached && [is_config_true gui.warndetachedcommit]} {
>                set msg [mc "You are about to commit on a detached head.\
>  This is a potentially dangerous thing to do because if you switch\
>  to another branch you will loose your changes and it can be difficult\
> diff --git a/lib/option.tcl b/lib/option.tcl
> index 719103a..f7f866b 100644
> --- a/lib/option.tcl
> +++ b/lib/option.tcl
> @@ -156,6 +156,7 @@ proc do_options {} {
>                {i-0..99 gui.commitmsgwidth {mc "Commit Message Text Width"}}
>                {t gui.newbranchtemplate {mc "New Branch Name Template"}}
>                {c gui.encoding {mc "Default File Contents Encoding"}}
> +               {b gui.warndetachedcommit {mc "Warn before commiting to a detached head"}}
>                {s gui.stageuntracked {mc "Staging of untracked files"} {list "yes" "no" "ask"}}
>                } {
>                set type [lindex $option 0]

Pat,

if you're interessted in this patch, please fix the typo in the second
hunk, mentioning 'commiting'.

Also shouldn't this variable be called gui.warndetachedhead?

Thanks.

Bert

> --
> 1.7.7.908.g383b5
>
>

^ permalink raw reply

* error from 'git push' on v1.7.8-rc0
From: Stefan Näwe @ 2011-11-03 11:43 UTC (permalink / raw)
  To: gitlist, Junio C Hamano

I get errors from git push when trying to delete a (remote) branch:

$ ./bin-wrappers/git versiongit version 1.7.8.rc0
$ ./bin-wrappers/git push -q . :refs/heads/nogofatal: bad object
0000000000000000000000000000000000000000fatal: bad object
0000000000000000000000000000000000000000remote: warning: Allowing
deletion of corrupt ref.
I bisected this to: 52fed6e receive-pack: check connectivity before
concluding "git push"

$ git describe 52fed6e~1
v1.7.5.3-492-gf96400c

$ ./bin-wrappers/git version
git version 1.7.5.3.492.gf9640

$ ./bin-wrappers/git push -q . :refs/heads/nogo
remote: warning: Allowing deletion of corrupt ref.

It makes no difference if the branch to be deleted exists, or not.

Regards,
  Stefan
-- 
----------------------------------------------------------------
python -c "print '73746566616e2e6e6165776540676d61696c2e636f6d'.decode('hex')"

^ permalink raw reply

* Re: error from 'git push' on v1.7.8-rc0
From: Stefan Näwe @ 2011-11-03 11:49 UTC (permalink / raw)
  To: gitlist, Junio C Hamano
In-Reply-To: <CAJzBP5Q1_zX+H0jeBZNB81KLYAbtJWhUuHA3rf8CuW-_OSFXbg@mail.gmail.com>

On Thu, Nov 3, 2011 at 12:43 PM, Stefan Näwe <stefan.naewe@gmail.com> wrote:
> I get errors from git push when trying to delete a (remote) branch:
>
> $ ./bin-wrappers/git versiongit version 1.7.8.rc0
> $ ./bin-wrappers/git push -q . :refs/heads/nogofatal: bad object
$ ./bin-wrappers/git push -q . :refs/heads/nogo

(re-wrapped)

$ ./bin-wrappers/git push -q . :refs/heads/nogo
fatal: bad object 0000000000000000000000000000000000000000
fatal: bad object 0000000000000000000000000000000000000000
remote: warning: Allowing deletion of corrupt ref.


Stefan
-- 
----------------------------------------------------------------
python -c "print '73746566616e2e6e6165776540676d61696c2e636f6d'.decode('hex')"

^ permalink raw reply

* [PATCH] log: allow to specify diff pathspec in addition to prune pathspec
From: Nguyễn Thái Ngọc Duy @ 2011-11-03 12:15 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy
In-Reply-To: <1320314474-19536-1-git-send-email-pclouds@gmail.com>

Pathspec in "git log -p <pathspec>" is used for both commit pruning
and diff generation. If --full-diff is given, then diff pathspec is
reset to generate complete diff.

This patch gives more control to diff generation. The first pathspec
in "git log -p -- <pathspec> -- <pathspec>" is used as commit pruning
as usual. The second one is used for diff generation. So --full-diff
now is essentially "git log -p -- <pathspec> --".

This form requires specifying "--" twice. If a file name happens to be
"--", it may be misunderstood as the second "--" marker. This is an
unfortunate consequence for this syntax. Users can still use "./--" or
similar to workaround this.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Version 2. Now it looks more acceptable.

 Documentation/git-log.txt |    9 +++++++--
 revision.c                |   28 +++++++++++++++++++++++++---
 2 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-log.txt b/Documentation/git-log.txt
index 249fc87..8e00dbc 100644
--- a/Documentation/git-log.txt
+++ b/Documentation/git-log.txt
@@ -9,7 +9,7 @@ git-log - Show commit logs
 SYNOPSIS
 --------
 [verse]
-'git log' [<options>] [<since>..<until>] [[\--] <path>...]
+'git log' [<options>] [<since>..<until>] [[\--] <path>... [\-- <path>...]]
 
 DESCRIPTION
 -----------
@@ -52,11 +52,12 @@ OPTIONS
 	commit was reached.
 
 --full-diff::
-	Without this flag, "git log -p <path>..." shows commits that
+	Without this flag, `git log -p <path>...` shows commits that
 	touch the specified paths, and diffs about the same specified
 	paths.  With this, the full diff is shown for commits that touch
 	the specified paths; this means that "<path>..." limits only
 	commits, and doesn't limit diff for those commits.
+	It is equivalent to `git log -p \-- <path>... \--`.
 +
 Note that this affects all diff-based output types, e.g. those
 produced by --stat etc.
@@ -76,6 +77,10 @@ produced by --stat etc.
 +
 To prevent confusion with options and branch names, paths may need to
 be prefixed with "\-- " to separate them from options or refnames.
++
+If the second "\--" is found, the following pathspec is used to limit
+diff generation. Note that this affects all diff-based output types,
+e.g. those produced by --stat etc.
 
 include::rev-list-options.txt[]
 
diff --git a/revision.c b/revision.c
index 8764dde..f560647 100644
--- a/revision.c
+++ b/revision.c
@@ -1682,20 +1682,37 @@ static int handle_revision_pseudo_opt(const char *submodule,
  */
 int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct setup_revision_opt *opt)
 {
-	int i, flags, left, seen_dashdash, read_from_stdin, got_rev_arg = 0;
+	int i, flags, left, read_from_stdin, got_rev_arg = 0;
+	int seen_dashdash, seen_second_dashdash;
 	struct cmdline_pathspec prune_data;
+	struct cmdline_pathspec diff_data;
 	const char *submodule = NULL;
 
 	memset(&prune_data, 0, sizeof(prune_data));
+	memset(&diff_data, 0, sizeof(diff_data));
 	if (opt)
 		submodule = opt->submodule;
 
 	/* First, search for "--" */
-	seen_dashdash = 0;
+	seen_dashdash = seen_second_dashdash = 0;
 	for (i = 1; i < argc; i++) {
+		int i2;
 		const char *arg = argv[i];
 		if (strcmp(arg, "--"))
 			continue;
+
+		/* Search for second "--" */
+		for (i2 = i + 1; i2 < argc; i2++) {
+			const char *arg = argv[i2];
+			if (strcmp(arg, "--"))
+				continue;
+			argv[i2] = NULL;
+			if (argv[i2 + 1])
+				append_prune_data(&diff_data, argv + i2 + 1);
+			seen_second_dashdash = 1;
+			break;
+		}
+
 		argv[i] = NULL;
 		argc = i;
 		if (argv[i + 1])
@@ -1817,7 +1834,12 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
 		/* Can't prune commits with rename following: the paths change.. */
 		if (!DIFF_OPT_TST(&revs->diffopt, FOLLOW_RENAMES))
 			revs->prune = 1;
-		if (!revs->full_diff)
+		if (seen_second_dashdash) {
+			ALLOC_GROW(diff_data.path, diff_data.nr+1, diff_data.alloc);
+			diff_data.path[diff_data.nr++] = NULL;
+			diff_tree_setup_paths(diff_data.path, &revs->diffopt);
+		}
+		else if (!revs->full_diff)
 			diff_tree_setup_paths(revs->prune_data.raw, &revs->diffopt);
 	}
 	if (revs->combine_merges)
-- 
1.7.4.74.g639db

^ permalink raw reply related

* Re: git rev-parse --since=1970-01-01 does not work reliably
From: Nguyen Thai Ngoc Duy @ 2011-11-03 12:36 UTC (permalink / raw)
  To: Dmitry V. Levin; +Cc: git
In-Reply-To: <20111101124434.GB22229@altlinux.org>

2011/11/1 Dmitry V. Levin <ldv@altlinux.org>:
> BTW, the timezone specifier (UTC) in "git rev-parse --since='1970-01-01 UTC'"
> seems to be completely ignored by date string parser.

It takes this "00:00 1970-01-01 UTC"
-- 
Duy

^ permalink raw reply

* How to make "git push/pull" work in non-clone repo?
From: Hong-Ming Su @ 2011-11-03 14:59 UTC (permalink / raw)
  To: git

Hi,

I create a repo X with git init. After several commit in X, I clone a
bare repo Y from X.
I try to continue work in X, and push to/pull from Y. The command git
push and git pull fails. I see the error message but I do not know
which git command can fix that problem.
Then I clone Z from Y. git push/pull works in Z.
How to make "git push/pull" the same in X as in Z?

Thanks,
Hong-Ming

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox