* Re: Re* [PATCH 2/2] grep --no-index: don't use git standard exclusions
From: Bert Wesarg @ 2011-09-28 19:19 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <7vty7xwrsf.fsf_-_@alter.siamese.dyndns.org>
On Wed, Sep 28, 2011 at 00:21, Junio C Hamano <gitster@pobox.com> wrote:
> Bert Wesarg <bert.wesarg@googlemail.com> writes:
>
>> Would '--untracked-too' only be a synonym for '--no-index
>> --exclude-standard', i.e. the current behavior?
>
> That basically would be the idea. Perhaps something like this on top of
> a9e6436 (grep --no-index: don't use git standard exclusions, 2011-09-15).
>
> -- >8 --
> Subject: [PATCH 1/2] grep: teach --untracked and --exclude options
I would still vote to name the option --exclude-standard, like it is
done in 'git ls-files'. Which also has a --exclude=<pattern> option.
And I think a --exclude=<pattern> option would be useful for 'git
grep', too.
Else:
Acked-by: Bert Wesarg <bert.wesarg@googlemail.com>
Thanks.
Bert
>
> In a working tree of a git managed repository, "grep --untracked" would
> find the specified patterns from files in untracked files in addition to
> its usual behaviour of finding them in the tracked files.
>
> By default, when working with "--no-index" option, "grep" does not pay
> attention to .gitignore mechanism. "grep --no-index --exclude" can be
> used to tell the command to use .gitignore and stop reporting hits from
> files that would be ignored. Also, when working without "--no-index",
> "grep" honors .gitignore mechanism, and "grep --no-exclude" can be used
> to tell the command to include hits from files that are ignored.
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
> Documentation/git-grep.txt | 15 ++++++++++++++-
> builtin-grep.c | 25 ++++++++++++++++++-------
> 2 files changed, 32 insertions(+), 8 deletions(-)
>
> diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
> index e019e76..2ccfb90 100644
> --- a/Documentation/git-grep.txt
> +++ b/Documentation/git-grep.txt
> @@ -9,7 +9,7 @@ git-grep - Print lines matching a pattern
> SYNOPSIS
> --------
> [verse]
> -'git grep' [--cached]
> +'git grep' [--cached] [--untracked] [--excludes]
> [-a | --text] [-I] [-i | --ignore-case] [-w | --word-regexp]
> [-v | --invert-match] [-h|-H] [--full-name]
> [-E | --extended-regexp] [-G | --basic-regexp]
> @@ -36,6 +36,19 @@ OPTIONS
> Instead of searching in the working tree files, check
> the blobs registered in the index file.
>
> +--untracked::
> + In addition to searching in the tracked files in the working
> + tree, search also in untracked files.
> +
> +--no-excludes::
> + Also search in ignored files by not honoring the `.gitignore`
> + mechanism. Only useful with `--untracked`.
> +
> +--excludes::
> + Do not pay attention to ignored files specified via the `.gitignore`
> + mechanism. Only useful when searching files in the current directory
> + with `--no-index`.
> +
> -a::
> --text::
> Process binary files as if they were text.
> diff --git a/builtin-grep.c b/builtin-grep.c
> index a10946d..c6cfdf8 100644
> --- a/builtin-grep.c
> +++ b/builtin-grep.c
> @@ -646,12 +646,14 @@ static int grep_object(struct grep_opt *opt, const char **paths,
> die("unable to grep from object of type %s", typename(obj->type));
> }
>
> -static int grep_directory(struct grep_opt *opt, const char **paths)
> +static int grep_directory(struct grep_opt *opt, const char **paths, int exc_std)
> {
> struct dir_struct dir;
> int i, hit = 0;
>
> memset(&dir, 0, sizeof(dir));
> + if (exc_std)
> + setup_standard_excludes(&dir);
>
> fill_directory(&dir, paths);
> for (i = 0; i < dir.nr; i++) {
> @@ -749,7 +751,7 @@ static int help_callback(const struct option *opt, const char *arg, int unset)
> int cmd_grep(int argc, const char **argv, const char *prefix)
> {
> int hit = 0;
> - int cached = 0;
> + int cached = 0, untracked = 0, opt_exclude = -1;
> int seen_dashdash = 0;
> int external_grep_allowed__ignored;
> struct grep_opt opt;
> @@ -764,6 +766,10 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
> { OPTION_BOOLEAN, 0, "index", &use_index, NULL,
> "finds in contents not managed by git",
> PARSE_OPT_NOARG | PARSE_OPT_NEGHELP },
> + OPT_BOOLEAN(0, "untracked", &untracked,
> + "search in both tracked and untracked files"),
> + OPT_SET_INT(0, "exclude", &opt_exclude,
> + "search also in ignored files", 1),
> OPT_GROUP(""),
> OPT_BOOLEAN('v', "invert-match", &opt.invert,
> "show non-matching lines"),
> @@ -950,18 +956,23 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
> paths[1] = NULL;
> }
>
> - if (!use_index) {
> + if (!use_index && (untracked || cached))
> + die("--cached or --untracked cannot be used with --no-index.");
> +
> + if (!use_index || untracked) {
> int hit;
> - if (cached)
> - die("--cached cannot be used with --no-index.");
> + int use_exclude = (opt_exclude < 0) ? use_index : !!opt_exclude;
> if (list.nr)
> - die("--no-index cannot be used with revs.");
> - hit = grep_directory(&opt, paths);
> + die("--no-index or --untracked cannot be used with revs.");
> + hit = grep_directory(&opt, paths, use_exclude);
> if (use_threads)
> hit |= wait_all();
> return !hit;
> }
>
> + if (0 <= opt_exclude)
> + die("--exclude or --no-exclude cannot be used for tracked contents.");
> +
> if (!list.nr) {
> int hit;
> if (!cached)
> --
> 1.7.7.rc3.4.g8d714
>
>
>
^ permalink raw reply
* Re: Git is not scalable with too many refs/*
From: Martin Fick @ 2011-09-28 19:38 UTC (permalink / raw)
To: Christian Couder; +Cc: git, Christian Couder, Thomas Rast, Julian Phillips
In-Reply-To: <CAP8UFD3TWQHU0wLPuxMDnc3bRSz90Yd+yDMBe03kofeo-nr7yA@mail.gmail.com>
On Monday, September 26, 2011 06:41:04 am Christian Couder
wrote:
> On Sun, Sep 25, 2011 at 10:43 PM, Martin Fick
<mfick@codeaurora.org> wrote:
...
> > git checkout
> >
> > can also take rather long periods of time > 3 mins when
> > run on a repo with ~100K refs.
...
> > So, I bisected this issue also, and it seems that the
> > "offending" commit is
...
> > commit 680955702990c1d4bfb3c6feed6ae9c6cb5c3c07
> > Author: Christian Couder <chriscool@tuxfamily.org>
> >
> > replace_object: add mechanism to replace objects
> > found in "refs/replace/"
...
> I don't think there is an obvious problem with it, but it
> would be nice if you could dig a bit deeper.
>
> The first thing that could take a lot of time is the call
> to for_each_replace_ref() in this function:
>
> +static void prepare_replace_object(void)
> +{
> + static int replace_object_prepared;
> +
> + if (replace_object_prepared)
> + return;
> +
> + for_each_replace_ref(register_replace_ref, NULL);
> + replace_object_prepared = 1;
> +}
The time was actually spent in for_each_replace_ref()
which calls get_loose_refs() which has the recursive bug
that Julian Phillips fixed 2 days ago. Good to see that
this fix helps other use cases too.
So with that bug fixed, the thing taking the most time now
for a git checkout with ~100K refs seems to be the orphan
check as Thomas predicted. The strange part with this, is
that the orphan check seems to take only about ~20s in the
repo where the refs aren't packed. However, in the repo
where they are packed, this check takes at least 5min! This
seems a bit unusual, doesn't it? Is the filesystem that
much better at indexing refs than git's pack mechanism?
Seems unlikely, the unpacked refs take 312M in the FS, the
packed ones only take about 4.3M. I suspect their is
something else unexpected going on here in the packed ref
case.
Any thoughts? I will dig deeper...
-Martin
--
Employee of Qualcomm Innovation Center, Inc. which is a
member of Code Aurora Forum
^ permalink raw reply
* Re: [PATCH 1/3] parseopt: add OPT_NOOP_NOARG
From: Junio C Hamano @ 2011-09-28 19:47 UTC (permalink / raw)
To: René Scharfe; +Cc: git, Pierre Habouzit
In-Reply-To: <4E835CFE.7020501@lsrfire.ath.cx>
Thanks.
^ permalink raw reply
* Re: Re* [PATCH 2/2] grep --no-index: don't use git standard exclusions
From: Junio C Hamano @ 2011-09-28 20:01 UTC (permalink / raw)
To: Bert Wesarg; +Cc: git
In-Reply-To: <CAKPyHN3=mbtEkyFUBdZAJCEVXfJhwPVhVFWkNNfX-yhtw9498w@mail.gmail.com>
Bert Wesarg <bert.wesarg@googlemail.com> writes:
> I would still vote to name the option --exclude-standard, like it is
> done in 'git ls-files'.
Thanks for catching.
^ permalink raw reply
* Re: [PATCH v3] Docs: git checkout --orphan: `root commit' and `branch head'
From: Junio C Hamano @ 2011-09-28 20:34 UTC (permalink / raw)
To: Michael Witten
Cc: Carlos Martín Nieto, vra5107, Michael J Gruber, Matthieu Moy,
Eric Raible, Philip Oakley, Jeff King, Jay Soffian, git
In-Reply-To: <e4f46b39e9ed4203bfab8a81e25eb600-mfwitten@gmail.com>
Michael Witten <mfwitten@gmail.com> writes:
> See:
>
> Re: Can a git changeset be created with no parent
> Carlos Martín Nieto <cmn@elego.de>
> Message-ID: <1317073309.5579.9.camel@centaur.lab.cmartin.tk>
> http://article.gmane.org/gmane.comp.version-control.git/182170
>
> and:
>
> git help glossary
I think I had to tell somebody on this list not to do this no more than a
month ago.
It is a good practice to point to earlier discussions while polishing
patch, and it also is good to include pointers in the commit log message
as a _supporting material_ (additional reading), but that is *NOT* a
substitute for a properly written commit log message. You need to state
what problem you are trying to fix and how the proposed patch fixes it.
Pretty please.
As to what the updated text wants to talk about, I think most of them are
improvement, but there are a few glitches and nits.
> diff --git a/Documentation/git-checkout.txt b/Documentation/git-checkout.txt
> index c0a96e6..0b6e528 100644
> --- a/Documentation/git-checkout.txt
> +++ b/Documentation/git-checkout.txt
> @@ -125,29 +125,54 @@ explicitly give a name with '-b' in such a case.
> below for details.
>
> ---orphan::
> - Create a new 'orphan' branch, named <new_branch>, started from
> - <start_point> and switch to it. The first commit made on this
> - new branch will have no parents and it will be the root of a new
> - history totally disconnected from all the other branches and
> - commits.
> -+
> -The index and the working tree are adjusted as if you had previously run
> -"git checkout <start_point>". This allows you to start a new history
> -that records a set of paths similar to <start_point> by easily running
> -"git commit -a" to make the root commit.
> -+
> -This can be useful when you want to publish the tree from a commit
> -without exposing its full history. You might want to do this to publish
> -an open source branch of a project whose current tree is "clean", but
> -whose full history contains proprietary or otherwise encumbered bits of
> -code.
> -+
> -If you want to start a disconnected history that records a set of paths
> -that is totally different from the one of <start_point>, then you should
> -clear the index and the working tree right after creating the orphan
> -branch by running "git rm -rf ." from the top level of the working tree.
> -Afterwards you will be ready to prepare your new files, repopulating the
> -working tree, by copying them from elsewhere, extracting a tarball, etc.
> +--orphan::
> + Tell git to turn the next commit you create into a root commit
> + (that is, a commit without any parent); creating the next commit
> + is similar to creating the first commit after running "git{nbsp}init",
I do not think we ever used {nbsp} in our documentation set. Is it
absolutely necessary to use it in this context, and are you absolutely
sure this does not break various versions of documentation toolchain
people have?
> + except that the new commit will be referenced by the branch head
> + <new_branch> rather than "master".
The option works as a substitute for -B/-b in the sense that it takes a
name of the new branch, and the only difference is that the new branch
will start with no commit (yet). To reduce confusion, it might make sense
to update this part (and description of -b/-B) to begin like this:
--orphan <new_branch>::
to match the new explanation text, and the output from "git checkout -h".
By the way, shouldn't the entry for -b in "git checkout -h" output also
say <new branch>?
Micronit: "referenced by the branch head <new_branch>" might be easier to
read and understand with s/branch head/branch name/. Or perhaps
except that the new commit will be made on <new_branch> branch
rather than "master".
might be even better.
> ++
> +Furthermore, the working tree and index are adjusted as if you ran
> +"git{nbsp}checkout{nbsp}<start_point>"; thus, by just running
> +"git{nbsp}commit", you can create a root commit with a tree that is
> +exactly the same as the tree of <start_point>.
> ++
> +Naturally, before creating the commit, you may manipulate the index
> +in any way you want. ...
What value does "Naturally, " add to the understanding here? I would
understand if it were "Alternatively, ", though.
> +... For example, if you want to create a root commit
> +with a tree that is totally different from the tree of <start_point>,
> +then just clear the working tree and index first: From the top level
> +of the working tree, run "git{nbsp}rm{nbsp}-rf{nbsp}.", and then
> +prepare your new working tree and index as desired.
> ++
> +There are two common uses for this option:
> ++
> +--
> + Separate history::
> + Suppose that for convenience, you want to maintain
> + the website for your project in the same repository
> + as the project itself. In such a case, it may not
> + make much sense to interleave the history of the
> + website with the history of the project; you can use
> + the "--orphan" option in order to create these two
> + completely separate histories.
I suspect this is *not* common at all, and also because you would need a
working tree to update both lines of histories that are not related to
each other at all at the content level, I do not believe that "for
convenience" supports or justifies the use case at all. It would be less
convenient to update these two unrelated histories (switching between
branches would need to nuke the whole working tree, and the semantics to
carry local changes floating on top of the tip commit of the current
branch across branch switching actively works against you).
You would be better off using another repository to keep track of "the
website for your project".
In short, I do not think we would want to list the above as if we are
somehow encouraging the use of this option for such a use. It falls into
"because with --orphan you _could_", and definitely not "because having
these two unrelated projects in the same repository you work in is
convenient and/or necessary".
> + Hidden history::
> + Suppose you have a project that has proprietary
> + material that is never meant to be released to the
> + public, yet you now want to maintain an open source
> + history that may be published widely.
This cause does make sense; you would want the local changes floating on
top of the tip commit of the current branch carried across branch
switching.
> +In this case, it would not be enough just to remove the proprietary
> +material from the working tree and then create a new commit, because
> +the proprietary material would still be accessible through the new
> +commit's ancestry; the proprietary history must be hidden from the new
> +commit, and the "--orphan" option allows you to do so by ensuring that
> +the new commit has no parent.
Does this "In this case" paragraph format as part of the "Hidden history"
paragraph?
> ++
> +However, removing proprietary material from ancestry is usually a task
> +that is better performed by linkgit:git-filter-branch[1] and
> +linkgit:git-rebase[1], especially when there are multiple commits that
> +are already suitable for the open source history.
> +--
In general it is true, and not "especially". When you are just abandoning
the history and publishing tidied up tree without history afresh.
I suspect this is not exactly "filter-branch is better but you could use
checkout --orphan" like you made it sound in the above paragraph. If you
are bootstrapping a new open source project from the tip of a proprietary
tree, "checkout --orphan && edit to sanitize && commit" to start your
history afresh would be perfectly adequate for your PR people to say "Now
we are open", especially when you do not intend to accept fixes to older
revisions; you could filter-branch the older proprietary history but you
would not get much benefit out of the cost for doing so, I think.
Thanks.
^ permalink raw reply
* Re: [PATCH v3] send-email: auth plain/login fix
From: Junio C Hamano @ 2011-09-28 22:00 UTC (permalink / raw)
To: Zbigniew Jędrzejewski-Szmek; +Cc: joe, git
In-Reply-To: <1317205600-7210-1-git-send-email-zbyszek@in.waw.pl>
Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl> writes:
> This patch is tested by sending it :)
I am tempted to shorten the log message to cull the protocol trace.
send-email: auth plain/login fix
git send-email does not authenticate properly when communicating over TLS
with a server supporting only AUTH PLAIN and AUTH LOGIN. This is e.g. the
standard server setup under debian with exim4 and probably everywhere
where system accounts are used.
The problem (only?) exists when libauthen-sasl-cyrus-perl (Authen::SASL::Cyrus)
is installed. Importing Authen::SASL::Perl makes Authen::SASL use the perl
implementation, which works better.
The solution is based on this forum thread:
http://www.perlmonks.org/?node_id=904354
This patch is tested by sending it. Without this fix, the interaction with
the server failed like this:
...
Password:
Net::SMTP::SSL=GLOB(0x238f668)>>> AUTH
Net::SMTP::SSL=GLOB(0x238f668)<<< 501 5.5.2 AUTH mechanism must be specified
5.5.2 AUTH mechanism must be specified
Signed-off-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Around the line "the server failed like this:", it would be helpful if we
can say how others can reproduce the protocol exchange log shown above.
That would help those who may (or may not) be seeing a similar issue to
diagnose if this commit may help them (or is the culprit of a breakage
they find in the future).
Thanks.
^ permalink raw reply
* Re: Git is not scalable with too many refs/*
From: Martin Fick @ 2011-09-28 22:10 UTC (permalink / raw)
To: Christian Couder; +Cc: git, Christian Couder, Thomas Rast, Julian Phillips
In-Reply-To: <201109281338.04378.mfick@codeaurora.org>
On Wednesday, September 28, 2011 01:38:04 pm Martin Fick
wrote:
> On Monday, September 26, 2011 06:41:04 am Christian
> Couder
>
> wrote:
> > On Sun, Sep 25, 2011 at 10:43 PM, Martin Fick
>
> <mfick@codeaurora.org> wrote:
> ...
>
> > > git checkout
> > >
> > > can also take rather long periods of time > 3 mins
> > > when run on a repo with ~100K refs.
>
> ...
>
> > > So, I bisected this issue also, and it seems that
> > > the
> > >
> > > "offending" commit is
>
> ...
>
> > > commit 680955702990c1d4bfb3c6feed6ae9c6cb5c3c07
> > > Author: Christian Couder <chriscool@tuxfamily.org>
> > >
> > > replace_object: add mechanism to replace objects
> > >
> > > found in "refs/replace/"
>
> ...
>
> > I don't think there is an obvious problem with it, but
> > it would be nice if you could dig a bit deeper.
> >
> > The first thing that could take a lot of time is the
> > call to for_each_replace_ref() in this function:
> >
> > +static void prepare_replace_object(void)
> > +{
> > + static int replace_object_prepared;
> > +
> > + if (replace_object_prepared)
> > + return;
> > +
> > + for_each_replace_ref(register_replace_ref,
> > NULL); + replace_object_prepared = 1;
> > +}
>
> The time was actually spent in for_each_replace_ref()
> which calls get_loose_refs() which has the recursive bug
> that Julian Phillips fixed 2 days ago. Good to see that
> this fix helps other use cases too.
>
> So with that bug fixed, the thing taking the most time
> now for a git checkout with ~100K refs seems to be the
> orphan check as Thomas predicted. The strange part with
> this, is that the orphan check seems to take only about
> ~20s in the repo where the refs aren't packed. However,
> in the repo where they are packed, this check takes at
> least 5min! This seems a bit unusual, doesn't it? Is
> the filesystem that much better at indexing refs than
> git's pack mechanism? Seems unlikely, the unpacked refs
> take 312M in the FS, the packed ones only take about
> 4.3M. I suspect their is something else unexpected
> going on here in the packed ref case.
>
> Any thoughts? I will dig deeper...
I think the problem is that resolve_ref() walks a linked
list of searching for the packed ref. Does this mean that
packed refs are not indexed at all?
>
> -Martin
--
Employee of Qualcomm Innovation Center, Inc. which is a
member of Code Aurora Forum
^ permalink raw reply
* Re: Lack of detached signatures
From: Jeff King @ 2011-09-28 22:25 UTC (permalink / raw)
To: Junio C Hamano
Cc: Joseph Parmelee, Carlos Martín Nieto, Olsen, Alan R,
Michael Witten, git@vger.kernel.org
In-Reply-To: <7v1uv01uqm.fsf@alter.siamese.dyndns.org>
On Wed, Sep 28, 2011 at 09:45:21AM -0700, Junio C Hamano wrote:
> The world is not so blank-and-white. Trust is ultimately among humans. If
> this message is not from the real Junio, don't you think you will hear
> something like "No, that c6ba05... is forgery, please don't use it!" from
> him, when he finds this message on the Git mailing list? If he does not
> exercise diligence to even do that much, does he deserve your trust in the
> first place?
>
> GPG does add security (if you have the key) but you can do pretty well
> even without it in practice.
Your suggestion above is something like an audit trail. It doesn't
prevent all mischief from happening, but after it happens and is
noticed, we can do some analysis, figuring out what happened and how to
clean up. Banks do this all the time with transactions.
At the same time, banks don't rely solely on an audit trail. They also
have up-front mechanisms, like passwords and ATM secrets, that help
prevent mischief from happening in the first place. And when they fail,
we fall back to the audit trail.
So having preventative mechanisms and audit mechanisms is not an
either-or situation. They complement each other; the strengths of one
can help when the other fails.
In this case, I think signed tarballs would be a nice complement to the
natural human audit trail. It can stop some attacks early, without
having to worry about the effort of analyzing and cleaning up after the
fact. Can the signature be wrong, or be checked improperly? Of course.
If you realize your machine has been hacked and your key stolen, then
you let everybody know and we fall back to auditing what has already
happened.
Each mechanism you put in place has a cost, of course. And it's worth
thinking about whether that cost is worthwhile. But it really is as easy
as running "gpg --detach-sign" when you upload a release, isn't it? You
already do something similar with the signed tags. So the cost of the
mechanism is quite low.
> I do not think that is true at all. Developers just dropped *.tar.gz on a
> 'master' machine, and left the rest to a cron job that reflates the
> tarball into *.tar.bz2, sign both using a GPG key, and mirror them to the
> public-facing machines 'www'.
>
> Somebody who had access to the 'master' machine could add a new tarball
> and have it go thru the same exact process, getting signed by the cron.
Right. In theory the master machine is harder to hack than the public
facing mirrors, but in this case it was not. Every link in the trust
chain introduces new possibilities for failure.
Given that git releases are all made by you, why not just sign them
locally with the same key you use to sign the release tags? It does mean
you have to generate and upload the .bz2 yourself[1], but is that really
that big a deal?
-Peff
[1] This is a minor nit, and probably not worth breaking away from the
way the rest of the world does it, but it is somewhat silly to sign the
compressed data. I couldn't care less about the exact bytes in the
compressed version; what I care about is the actual tar file. The
compression is just a transport.
^ permalink raw reply
* Re: [PATCH/RFCv3 2/2] receive-pack: don't pass non-existent refs to post-{receive,update} hooks in push deletions
From: Junio C Hamano @ 2011-09-28 23:09 UTC (permalink / raw)
To: Pang Yan Han
Cc: git, Sitaram Chamarty, Shawn O. Pearce, Jeff King,
Johannes Schindelin
In-Reply-To: <20110928153935.GA7800@myhost>
Here is a quick fix-up on top.
Points of interest:
- Get rid of an unnecessary list; just a single bitfield is sufficient;
while at it, make skip_update also a bitfield.
- ref_exists(refname) calls resolve_ref() which wants "const char *",
and does not touch refname itself. Make it to take "const char *",
instead of casting const away in the new caller.
- Properly indent test script. Also:
- Use "cmd <<-EOF" to allow indenting the here text with leading tabs;
- Strip useless use of single quote around "cmd <<'EOF'" inside
test_expect_success; they were only stepping out of the sq context
while the third argument to test_expect_success was being parsed. A
construct like this:
cat >pre-receive.expect <<'EOF'
$orgmaster $newmaster refs/heads/master
EOF
only misleads the readers as if the file is getting literal variable
names without substitution, which is not what is going on.
I didn't fix this in this fix-up patch, but you also need to test the case
where _ONLY_ deletion of a non-existent ref is requested, in which case,
pre-receive and update should be told about it, but post-receive and
post-update should not be even run (i.e. test the absense of these
post-*.actual files).
builtin/receive-pack.c | 41 ++++++---------------
refs.c | 2 +-
refs.h | 2 +-
t/t5516-fetch-push.sh | 90 ++++++++++++++++++++++++-----------------------
4 files changed, 60 insertions(+), 75 deletions(-)
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 8a0a9d2..b73f18a 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -147,32 +147,13 @@ static void write_head_info(void)
struct command {
struct command *next;
const char *error_string;
- unsigned int skip_update;
+ unsigned int skip_update:1,
+ did_not_exist:1;
unsigned char old_sha1[20];
unsigned char new_sha1[20];
char ref_name[FLEX_ARRAY]; /* more */
};
-/* For invalid refs */
-static struct command **invalid_delete;
-static size_t invalid_delete_nr;
-static size_t invalid_delete_alloc;
-
-static void invalid_delete_append(struct command *cmd)
-{
- ALLOC_GROW(invalid_delete, invalid_delete_nr + 1, invalid_delete_alloc);
- invalid_delete[invalid_delete_nr++] = cmd;
-}
-
-static int is_invalid_delete(struct command *cmd)
-{
- size_t i;
- for (i = 0; i < invalid_delete_nr; i++)
- if (invalid_delete[i] == cmd)
- return 1;
- return 0;
-}
-
static const char pre_receive_hook[] = "hooks/pre-receive";
static const char post_receive_hook[] = "hooks/post-receive";
@@ -235,7 +216,7 @@ static int run_receive_hook(struct command *commands, const char *hook_name)
int have_input = 0, code;
for (cmd = commands; !have_input && cmd; cmd = cmd->next) {
- if (!cmd->error_string && !is_invalid_delete(cmd))
+ if (!cmd->error_string && !cmd->did_not_exist)
have_input = 1;
}
@@ -268,7 +249,7 @@ static int run_receive_hook(struct command *commands, const char *hook_name)
}
for (cmd = commands; cmd; cmd = cmd->next) {
- if (!cmd->error_string && !is_invalid_delete(cmd)) {
+ if (!cmd->error_string && !cmd->did_not_exist) {
size_t n = snprintf(buf, sizeof(buf), "%s %s %s\n",
sha1_to_hex(cmd->old_sha1),
sha1_to_hex(cmd->new_sha1),
@@ -465,10 +446,13 @@ static const char *update(struct command *cmd)
if (is_null_sha1(new_sha1)) {
if (!parse_object(old_sha1)) {
- rp_warning("Allowing deletion of corrupt ref.");
old_sha1 = NULL;
- if (!ref_exists((char *) name))
- invalid_delete_append(cmd);
+ if (ref_exists(name)) {
+ rp_warning("Allowing deletion of corrupt ref.");
+ } else {
+ rp_warning("Deleting a non-existent ref.");
+ cmd->did_not_exist = 1;
+ }
}
if (delete_ref(namespaced_name, old_sha1, 0)) {
rp_error("failed to delete %s", name);
@@ -499,7 +483,7 @@ static void run_update_post_hook(struct command *commands)
struct child_process proc;
for (argc = 0, cmd = commands; cmd; cmd = cmd->next) {
- if (cmd->error_string || is_invalid_delete(cmd))
+ if (cmd->error_string || cmd->did_not_exist)
continue;
argc++;
}
@@ -510,7 +494,7 @@ static void run_update_post_hook(struct command *commands)
for (argc = 1, cmd = commands; cmd; cmd = cmd->next) {
char *p;
- if (cmd->error_string || is_invalid_delete(cmd))
+ if (cmd->error_string || cmd->did_not_exist)
continue;
p = xmalloc(strlen(cmd->ref_name) + 1);
strcpy(p, cmd->ref_name);
@@ -888,6 +872,5 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
}
if (use_sideband)
packet_flush(1);
- free(invalid_delete);
return 0;
}
diff --git a/refs.c b/refs.c
index a615043..43e9225 100644
--- a/refs.c
+++ b/refs.c
@@ -1851,7 +1851,7 @@ int update_ref(const char *action, const char *refname,
return 0;
}
-int ref_exists(char *refname)
+int ref_exists(const char *refname)
{
unsigned char sha1[20];
return !!resolve_ref(refname, sha1, 1, NULL);
diff --git a/refs.h b/refs.h
index dfb086e..4431205 100644
--- a/refs.h
+++ b/refs.h
@@ -57,7 +57,7 @@ extern void warn_dangling_symref(FILE *fp, const char *msg_fmt, const char *refn
*/
extern void add_extra_ref(const char *refname, const unsigned char *sha1, int flags);
extern void clear_extra_refs(void);
-extern int ref_exists(char *);
+extern int ref_exists(const char *);
extern int peel_ref(const char *, unsigned char *);
diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh
index c0d8a0e..47db4b1 100755
--- a/t/t5516-fetch-push.sh
+++ b/t/t5516-fetch-push.sh
@@ -43,33 +43,34 @@ mk_test () {
mk_test_with_hooks() {
mk_test "$@" &&
(
- cd testrepo &&
- mkdir .git/hooks &&
- cd .git/hooks &&
+ cd testrepo &&
+ mkdir .git/hooks &&
+ cd .git/hooks &&
- cat >pre-receive <<'EOF' &&
-#!/bin/sh
-cat - >>pre-receive.actual
-EOF
+ cat >pre-receive <<-'EOF' &&
+ #!/bin/sh
+ cat - >>pre-receive.actual
+ EOF
- cat >update <<'EOF' &&
-#!/bin/sh
-printf "%s %s %s\n" "$@" >>update.actual
-EOF
- cat >post-receive <<'EOF' &&
-#!/bin/sh
-cat - >>post-receive.actual
-EOF
+ cat >update <<-'EOF' &&
+ #!/bin/sh
+ printf "%s %s %s\n" "$@" >>update.actual
+ EOF
- cat >post-update <<'EOF' &&
-#!/bin/sh
-for ref in "$@"
-do
- printf "%s\n" "$ref" >>post-update.actual
-done
-EOF
+ cat >post-receive <<-'EOF' &&
+ #!/bin/sh
+ cat - >>post-receive.actual
+ EOF
- chmod u+x pre-receive update post-receive post-update
+ cat >post-update <<-'EOF' &&
+ #!/bin/sh
+ for ref in "$@"
+ do
+ printf "%s\n" "$ref" >>post-update.actual
+ done
+ EOF
+
+ chmod +x pre-receive update post-receive post-update
)
}
@@ -599,31 +600,32 @@ test_expect_success 'pushing valid refs triggers post-receive and post-update ho
orgnext=$(cd testrepo && git show-ref -s --verify refs/heads/next) &&
newnext=$_z40 &&
git push testrepo refs/heads/master:refs/heads/master :refs/heads/next &&
- (cd testrepo/.git &&
- cat >pre-receive.expect <<'EOF' &&
-$orgmaster $newmaster refs/heads/master
-$orgnext $newnext refs/heads/next
-EOF
+ (
+ cd testrepo/.git &&
+ cat >pre-receive.expect <<-EOF &&
+ $orgmaster $newmaster refs/heads/master
+ $orgnext $newnext refs/heads/next
+ EOF
- cat >update.expect <<'EOF' &&
-refs/heads/master $orgmaster $newmaster
-refs/heads/next $orgnext $newnext
-EOF
+ cat >update.expect <<-EOF &&
+ refs/heads/master $orgmaster $newmaster
+ refs/heads/next $orgnext $newnext
+ EOF
- cat >post-receive.expect <<'EOF' &&
-$orgmaster $newmaster refs/heads/master
-$orgnext $newnext refs/heads/next
-EOF
+ cat >post-receive.expect <<-EOF &&
+ $orgmaster $newmaster refs/heads/master
+ $orgnext $newnext refs/heads/next
+ EOF
- cat >post-update.expect <<'EOF' &&
-refs/heads/master
-refs/heads/next
-EOF
+ cat >post-update.expect <<-EOF &&
+ refs/heads/master
+ refs/heads/next
+ EOF
- test_cmp pre-receive.expect pre-receive.actual &&
- test_cmp update.expect update.actual &&
- test_cmp post-receive.expect post-receive.actual &&
- test_cmp post-update.expect post-update.actual
+ test_cmp pre-receive.expect pre-receive.actual &&
+ test_cmp update.expect update.actual &&
+ test_cmp post-receive.expect post-receive.actual &&
+ test_cmp post-update.expect post-update.actual
)
'
--
1.7.7.rc3.4.g8d714
^ permalink raw reply related
* Re: [PATCH/RFCv3 2/2] receive-pack: don't pass non-existent refs to post-{receive,update} hooks in push deletions
From: Pang Yan Han @ 2011-09-28 23:08 UTC (permalink / raw)
To: Junio C Hamano
Cc: git, Sitaram Chamarty, Shawn O. Pearce, Jeff King,
Johannes Schindelin
In-Reply-To: <7vsjngxphv.fsf@alter.siamese.dyndns.org>
On Wed, Sep 28, 2011 at 03:37:32PM -0700, Junio C Hamano wrote:
> Pang Yan Han <pangyanhan@gmail.com> writes:
>
> > +/* For invalid refs */
> > +static struct command **invalid_delete;
> > +static size_t invalid_delete_nr;
> > +static size_t invalid_delete_alloc;
>
> Do you have to have these separately only to keep track of the corner case
> errors? I would have expected that it would be more natural to mark them
> by adding a single bitfield to "struct command".
Yes they are purely for keeping track of deleting non-existent refs.
Ok I will add the bitfield to struct command.
>
> > @@ -447,6 +467,8 @@ static const char *update(struct command *cmd)
> > if (!parse_object(old_sha1)) {
> > rp_warning("Allowing deletion of corrupt ref.");
> > old_sha1 = NULL;
> > + if (!ref_exists((char *) name))
> > + invalid_delete_append(cmd);
>
> This is not an "invalid" delete but deleting a non-existing ref. Perhaps
> you would want to move the warning and optionally reword it as well, e.g.
>
> if (!parse_object(old_sha1)) {
> old_sha1 = NULL;
> if (ref_exists(name)) {
> rp_warning("Allowing deletion of corrupt ref.");
> } else {
> rp_warning("Deleting a ref that does not exist.");
> cmd->did_not_exist = 1;
> }
> ...
Sure thing.
I am unable to reply this until much later, but are the tests in this patch ok?
It's the first time I'm writing test cases for git.
Thanks for the feedback!
^ permalink raw reply
* Re: [PATCH/RFCv3 2/2] receive-pack: don't pass non-existent refs to post-{receive,update} hooks in push deletions
From: Pang Yan Han @ 2011-09-28 23:11 UTC (permalink / raw)
To: Junio C Hamano
Cc: git, Sitaram Chamarty, Shawn O. Pearce, Jeff King,
Johannes Schindelin
In-Reply-To: <7voby4xo18.fsf@alter.siamese.dyndns.org>
Sorry Junio, disregard my earlier message. I didn't see this email.
On Wed, Sep 28, 2011 at 04:09:07PM -0700, Junio C Hamano wrote:
> Here is a quick fix-up on top.
>
> Points of interest:
>
> - Get rid of an unnecessary list; just a single bitfield is sufficient;
> while at it, make skip_update also a bitfield.
>
> - ref_exists(refname) calls resolve_ref() which wants "const char *",
> and does not touch refname itself. Make it to take "const char *",
> instead of casting const away in the new caller.
>
> - Properly indent test script. Also:
>
> - Use "cmd <<-EOF" to allow indenting the here text with leading tabs;
>
> - Strip useless use of single quote around "cmd <<'EOF'" inside
> test_expect_success; they were only stepping out of the sq context
> while the third argument to test_expect_success was being parsed. A
> construct like this:
>
> cat >pre-receive.expect <<'EOF'
> $orgmaster $newmaster refs/heads/master
> EOF
>
> only misleads the readers as if the file is getting literal variable
> names without substitution, which is not what is going on.
>
> I didn't fix this in this fix-up patch, but you also need to test the case
> where _ONLY_ deletion of a non-existent ref is requested, in which case,
> pre-receive and update should be told about it, but post-receive and
> post-update should not be even run (i.e. test the absense of these
> post-*.actual files).
>
> builtin/receive-pack.c | 41 ++++++---------------
> refs.c | 2 +-
> refs.h | 2 +-
> t/t5516-fetch-push.sh | 90 ++++++++++++++++++++++++-----------------------
> 4 files changed, 60 insertions(+), 75 deletions(-)
>
> diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
> index 8a0a9d2..b73f18a 100644
> --- a/builtin/receive-pack.c
> +++ b/builtin/receive-pack.c
> @@ -147,32 +147,13 @@ static void write_head_info(void)
> struct command {
> struct command *next;
> const char *error_string;
> - unsigned int skip_update;
> + unsigned int skip_update:1,
> + did_not_exist:1;
> unsigned char old_sha1[20];
> unsigned char new_sha1[20];
> char ref_name[FLEX_ARRAY]; /* more */
> };
>
> -/* For invalid refs */
> -static struct command **invalid_delete;
> -static size_t invalid_delete_nr;
> -static size_t invalid_delete_alloc;
> -
> -static void invalid_delete_append(struct command *cmd)
> -{
> - ALLOC_GROW(invalid_delete, invalid_delete_nr + 1, invalid_delete_alloc);
> - invalid_delete[invalid_delete_nr++] = cmd;
> -}
> -
> -static int is_invalid_delete(struct command *cmd)
> -{
> - size_t i;
> - for (i = 0; i < invalid_delete_nr; i++)
> - if (invalid_delete[i] == cmd)
> - return 1;
> - return 0;
> -}
> -
> static const char pre_receive_hook[] = "hooks/pre-receive";
> static const char post_receive_hook[] = "hooks/post-receive";
>
> @@ -235,7 +216,7 @@ static int run_receive_hook(struct command *commands, const char *hook_name)
> int have_input = 0, code;
>
> for (cmd = commands; !have_input && cmd; cmd = cmd->next) {
> - if (!cmd->error_string && !is_invalid_delete(cmd))
> + if (!cmd->error_string && !cmd->did_not_exist)
> have_input = 1;
> }
>
> @@ -268,7 +249,7 @@ static int run_receive_hook(struct command *commands, const char *hook_name)
> }
>
> for (cmd = commands; cmd; cmd = cmd->next) {
> - if (!cmd->error_string && !is_invalid_delete(cmd)) {
> + if (!cmd->error_string && !cmd->did_not_exist) {
> size_t n = snprintf(buf, sizeof(buf), "%s %s %s\n",
> sha1_to_hex(cmd->old_sha1),
> sha1_to_hex(cmd->new_sha1),
> @@ -465,10 +446,13 @@ static const char *update(struct command *cmd)
>
> if (is_null_sha1(new_sha1)) {
> if (!parse_object(old_sha1)) {
> - rp_warning("Allowing deletion of corrupt ref.");
> old_sha1 = NULL;
> - if (!ref_exists((char *) name))
> - invalid_delete_append(cmd);
> + if (ref_exists(name)) {
> + rp_warning("Allowing deletion of corrupt ref.");
> + } else {
> + rp_warning("Deleting a non-existent ref.");
> + cmd->did_not_exist = 1;
> + }
> }
> if (delete_ref(namespaced_name, old_sha1, 0)) {
> rp_error("failed to delete %s", name);
> @@ -499,7 +483,7 @@ static void run_update_post_hook(struct command *commands)
> struct child_process proc;
>
> for (argc = 0, cmd = commands; cmd; cmd = cmd->next) {
> - if (cmd->error_string || is_invalid_delete(cmd))
> + if (cmd->error_string || cmd->did_not_exist)
> continue;
> argc++;
> }
> @@ -510,7 +494,7 @@ static void run_update_post_hook(struct command *commands)
>
> for (argc = 1, cmd = commands; cmd; cmd = cmd->next) {
> char *p;
> - if (cmd->error_string || is_invalid_delete(cmd))
> + if (cmd->error_string || cmd->did_not_exist)
> continue;
> p = xmalloc(strlen(cmd->ref_name) + 1);
> strcpy(p, cmd->ref_name);
> @@ -888,6 +872,5 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
> }
> if (use_sideband)
> packet_flush(1);
> - free(invalid_delete);
> return 0;
> }
> diff --git a/refs.c b/refs.c
> index a615043..43e9225 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -1851,7 +1851,7 @@ int update_ref(const char *action, const char *refname,
> return 0;
> }
>
> -int ref_exists(char *refname)
> +int ref_exists(const char *refname)
> {
> unsigned char sha1[20];
> return !!resolve_ref(refname, sha1, 1, NULL);
> diff --git a/refs.h b/refs.h
> index dfb086e..4431205 100644
> --- a/refs.h
> +++ b/refs.h
> @@ -57,7 +57,7 @@ extern void warn_dangling_symref(FILE *fp, const char *msg_fmt, const char *refn
> */
> extern void add_extra_ref(const char *refname, const unsigned char *sha1, int flags);
> extern void clear_extra_refs(void);
> -extern int ref_exists(char *);
> +extern int ref_exists(const char *);
>
> extern int peel_ref(const char *, unsigned char *);
>
> diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh
> index c0d8a0e..47db4b1 100755
> --- a/t/t5516-fetch-push.sh
> +++ b/t/t5516-fetch-push.sh
> @@ -43,33 +43,34 @@ mk_test () {
> mk_test_with_hooks() {
> mk_test "$@" &&
> (
> - cd testrepo &&
> - mkdir .git/hooks &&
> - cd .git/hooks &&
> + cd testrepo &&
> + mkdir .git/hooks &&
> + cd .git/hooks &&
>
> - cat >pre-receive <<'EOF' &&
> -#!/bin/sh
> -cat - >>pre-receive.actual
> -EOF
> + cat >pre-receive <<-'EOF' &&
> + #!/bin/sh
> + cat - >>pre-receive.actual
> + EOF
>
> - cat >update <<'EOF' &&
> -#!/bin/sh
> -printf "%s %s %s\n" "$@" >>update.actual
> -EOF
> - cat >post-receive <<'EOF' &&
> -#!/bin/sh
> -cat - >>post-receive.actual
> -EOF
> + cat >update <<-'EOF' &&
> + #!/bin/sh
> + printf "%s %s %s\n" "$@" >>update.actual
> + EOF
>
> - cat >post-update <<'EOF' &&
> -#!/bin/sh
> -for ref in "$@"
> -do
> - printf "%s\n" "$ref" >>post-update.actual
> -done
> -EOF
> + cat >post-receive <<-'EOF' &&
> + #!/bin/sh
> + cat - >>post-receive.actual
> + EOF
>
> - chmod u+x pre-receive update post-receive post-update
> + cat >post-update <<-'EOF' &&
> + #!/bin/sh
> + for ref in "$@"
> + do
> + printf "%s\n" "$ref" >>post-update.actual
> + done
> + EOF
> +
> + chmod +x pre-receive update post-receive post-update
> )
> }
>
> @@ -599,31 +600,32 @@ test_expect_success 'pushing valid refs triggers post-receive and post-update ho
> orgnext=$(cd testrepo && git show-ref -s --verify refs/heads/next) &&
> newnext=$_z40 &&
> git push testrepo refs/heads/master:refs/heads/master :refs/heads/next &&
> - (cd testrepo/.git &&
> - cat >pre-receive.expect <<'EOF' &&
> -$orgmaster $newmaster refs/heads/master
> -$orgnext $newnext refs/heads/next
> -EOF
> + (
> + cd testrepo/.git &&
> + cat >pre-receive.expect <<-EOF &&
> + $orgmaster $newmaster refs/heads/master
> + $orgnext $newnext refs/heads/next
> + EOF
>
> - cat >update.expect <<'EOF' &&
> -refs/heads/master $orgmaster $newmaster
> -refs/heads/next $orgnext $newnext
> -EOF
> + cat >update.expect <<-EOF &&
> + refs/heads/master $orgmaster $newmaster
> + refs/heads/next $orgnext $newnext
> + EOF
>
> - cat >post-receive.expect <<'EOF' &&
> -$orgmaster $newmaster refs/heads/master
> -$orgnext $newnext refs/heads/next
> -EOF
> + cat >post-receive.expect <<-EOF &&
> + $orgmaster $newmaster refs/heads/master
> + $orgnext $newnext refs/heads/next
> + EOF
>
> - cat >post-update.expect <<'EOF' &&
> -refs/heads/master
> -refs/heads/next
> -EOF
> + cat >post-update.expect <<-EOF &&
> + refs/heads/master
> + refs/heads/next
> + EOF
>
> - test_cmp pre-receive.expect pre-receive.actual &&
> - test_cmp update.expect update.actual &&
> - test_cmp post-receive.expect post-receive.actual &&
> - test_cmp post-update.expect post-update.actual
> + test_cmp pre-receive.expect pre-receive.actual &&
> + test_cmp update.expect update.actual &&
> + test_cmp post-receive.expect post-receive.actual &&
> + test_cmp post-update.expect post-update.actual
> )
> '
>
> --
> 1.7.7.rc3.4.g8d714
>
^ permalink raw reply
* Re: Lack of detached signatures
From: Joseph Parmelee @ 2011-09-28 22:40 UTC (permalink / raw)
To: Junio C Hamano
Cc: Carlos Martín Nieto, Olsen, Alan R, Michael Witten,
git@vger.kernel.org
In-Reply-To: <7v1uv01uqm.fsf@alter.siamese.dyndns.org>
On Wed, 28 Sep 2011, Junio C Hamano wrote:
> Joseph Parmelee <jparmele@wildbear.com> writes:
>
>> There is confusion here between the repository and the tarball. Once you
>> have produced the tarball there is NO cryptographic protection against
>> forgeries unless you sign it with GPG.
>
> True.
>
> If I give you a URL http://code.google.com/p/git-core/downloads/list with
> checksums
>
> $ sha1sum git-1.7.7.rc3.tar.gz
> c6ba05a833cab49dd66dd1e252306e187effbf2b git-1.7.7.rc3.tar.gz
>
> You either have to trust that code.google.com/ is not broken, or this
> message is coming from real Junio (provided if you can trust him in the
> first place).
>
How do I know that I am actually connected to code.google.com and not some
other site served up to me by a bogus proxy somewhere?
> BUT.
>
> The world is not so blank-and-white. Trust is ultimately among humans. If
> this message is not from the real Junio, don't you think you will hear
> something like "No, that c6ba05... is forgery, please don't use it!" from
> him, when he finds this message on the Git mailing list? If he does not
> exercise diligence to even do that much, does he deserve your trust in the
> first place?
>
> GPG does add security (if you have the key) but you can do pretty well
> even without it in practice.
>
Nonsense. There is a reason why responsible sites everywhere use detached
signatures on their release tarballs.
>> It is only because kernel.org exercised due diligence in the production of
>> tags and signatures on all their tarballs that the kernel code itself
>> withstood their recent intrusion....
>
> I do not think that is true at all. Developers just dropped *.tar.gz on a
> 'master' machine, and left the rest to a cron job that reflates the
> tarball into *.tar.bz2, sign both using a GPG key, and mirror them to the
> public-facing machines 'www'.
>
> Somebody who had access to the 'master' machine could add a new tarball
> and have it go thru the same exact process, getting signed by the cron.
>
The "cron job" provided the passphrase for the signing as well instead of
requiring a human to authorize the transaction by providing the passphrase
or by some other means? I suspect not. Have you actually used GPG to sign
something?
And even if that egregious error (no human authorization) had been made,
there is the matter of the secret signing key. Of course if the bad guys
have that (and the passphrase) then they have everything and can readily
prepare fraudulent packages. But without a detached signature you are
allowing them to do it even without going to the bother of stealing the
secret key and breaking/stealing the passphrase. Without a dual key
signature, you are providing ABSOLUTELY NO protection against
man-in-the-middle attacks that you will never know occurred, but which will
nevertheless (rightfully) reflect on your project. To just assert that "you
can do pretty well even without it in practice" is just plain irresponsible
and convinces me not to update our copies of git until it returns to
kernel.org and to administrators that understand the situation.
^ permalink raw reply
* Re: [PATCH/RFCv3 2/2] receive-pack: don't pass non-existent refs to post-{receive,update} hooks in push deletions
From: Junio C Hamano @ 2011-09-28 23:28 UTC (permalink / raw)
To: Pang Yan Han
Cc: Junio C Hamano, git, Sitaram Chamarty, Shawn O. Pearce, Jeff King,
Johannes Schindelin
In-Reply-To: <20110928230809.GA1798@myhost>
Pang Yan Han <pangyanhan@gmail.com> writes:
> I am unable to reply this until much later, but are the tests in this patch ok?
> It's the first time I'm writing test cases for git.
I'll squash in a bit more updates and later publish the result.
Thanks.
^ permalink raw reply
* Re: [PATCH/RFCv3 2/2] receive-pack: don't pass non-existent refs to post-{receive,update} hooks in push deletions
From: Junio C Hamano @ 2011-09-28 22:37 UTC (permalink / raw)
To: Pang Yan Han
Cc: git, Sitaram Chamarty, Shawn O. Pearce, Jeff King,
Johannes Schindelin
In-Reply-To: <20110928153935.GA7800@myhost>
Pang Yan Han <pangyanhan@gmail.com> writes:
> +/* For invalid refs */
> +static struct command **invalid_delete;
> +static size_t invalid_delete_nr;
> +static size_t invalid_delete_alloc;
Do you have to have these separately only to keep track of the corner case
errors? I would have expected that it would be more natural to mark them
by adding a single bitfield to "struct command".
> @@ -447,6 +467,8 @@ static const char *update(struct command *cmd)
> if (!parse_object(old_sha1)) {
> rp_warning("Allowing deletion of corrupt ref.");
> old_sha1 = NULL;
> + if (!ref_exists((char *) name))
> + invalid_delete_append(cmd);
This is not an "invalid" delete but deleting a non-existing ref. Perhaps
you would want to move the warning and optionally reword it as well, e.g.
if (!parse_object(old_sha1)) {
old_sha1 = NULL;
if (ref_exists(name)) {
rp_warning("Allowing deletion of corrupt ref.");
} else {
rp_warning("Deleting a ref that does not exist.");
cmd->did_not_exist = 1;
}
...
^ permalink raw reply
* Re: Lack of detached signatures
From: Ted Ts'o @ 2011-09-28 23:09 UTC (permalink / raw)
To: Jeff King
Cc: Junio C Hamano, Joseph Parmelee, Carlos Martín Nieto,
Olsen, Alan R, Michael Witten, git@vger.kernel.org
In-Reply-To: <20110928222542.GA18120@sigill.intra.peff.net>
On Wed, Sep 28, 2011 at 06:25:43PM -0400, Jeff King wrote:
> [1] This is a minor nit, and probably not worth breaking away from the
> way the rest of the world does it, but it is somewhat silly to sign the
> compressed data. I couldn't care less about the exact bytes in the
> compressed version; what I care about is the actual tar file. The
> compression is just a transport.
The worry I have is that many users don't check the GPG checksum files
as it is. If they have to decompress the file, and then run gpg to
check the checksum, they might never get around to doing it.
That being said, I'm not sure I have a good solution. One is to ship
the file without using detached signatures, and ship a foo.tar.gz.gpg
file, and force them to use GPG to unwrap the file before it can be
unpacked. But users would yell and scream if we did that...
- Ted
^ permalink raw reply
* Re: Lack of detached signatures
From: Junio C Hamano @ 2011-09-29 0:28 UTC (permalink / raw)
To: Ted Ts'o
Cc: Jeff King, Joseph Parmelee, Carlos Martín Nieto,
Olsen, Alan R, Michael Witten, git@vger.kernel.org
In-Reply-To: <20110928230958.GJ19250@thunk.org>
Ted Ts'o <tytso@mit.edu> writes:
> On Wed, Sep 28, 2011 at 06:25:43PM -0400, Jeff King wrote:
>> [1] This is a minor nit, and probably not worth breaking away from the
>> way the rest of the world does it, but it is somewhat silly to sign the
>> compressed data. I couldn't care less about the exact bytes in the
>> compressed version; what I care about is the actual tar file. The
>> compression is just a transport.
>
> The worry I have is that many users don't check the GPG checksum files
> as it is. If they have to decompress the file, and then run gpg to
> check the checksum, they might never get around to doing it.
>
> That being said, I'm not sure I have a good solution. One is to ship
> the file without using detached signatures, and ship a foo.tar.gz.gpg
> file, and force them to use GPG to unwrap the file before it can be
> unpacked. But users would yell and scream if we did that...
I suspect that letting GPG do the compression and shipping foo.tar.gpg
would work just fine as well, and it is somewhat a tempting response to a
_demand_ to sign materials we distribute. Of course, a nicer response to a
_request_ would be to give a detached signature ;-)
I understand that the automated GPG signature k.org used to use on the
master machine was primarily to protect the copies that the mirrors serve
from getting tampered after they leave the master machine. Do you happen
to know what the new policy will be? Will the developers who distribute
their snapshot tarballs from the site be GPG signing them themselves
before uploading? That would improve the situation (I suspect that there
were some people who misunderstood that these GPG signature were to
protect against break-in at the master machine), but at the same time, it
may create the chicken-and-egg bootstrapping problem if public keys of too
many people need to be published securely.
^ permalink raw reply
* Re: Git is not scalable with too many refs/*
From: Julian Phillips @ 2011-09-29 0:54 UTC (permalink / raw)
To: Martin Fick; +Cc: Christian Couder, git, Christian Couder, Thomas Rast
In-Reply-To: <201109281610.49322.mfick@codeaurora.org>
On Wed, 28 Sep 2011 16:10:48 -0600, Martin Fick wrote:
> On Wednesday, September 28, 2011 01:38:04 pm Martin Fick
> wrote:
-- snip --
>> So with that bug fixed, the thing taking the most time
>> now for a git checkout with ~100K refs seems to be the
>> orphan check as Thomas predicted. The strange part with
>> this, is that the orphan check seems to take only about
>> ~20s in the repo where the refs aren't packed. However,
>> in the repo where they are packed, this check takes at
>> least 5min! This seems a bit unusual, doesn't it? Is
>> the filesystem that much better at indexing refs than
>> git's pack mechanism? Seems unlikely, the unpacked refs
>> take 312M in the FS, the packed ones only take about
>> 4.3M. I suspect their is something else unexpected
>> going on here in the packed ref case.
>>
>> Any thoughts? I will dig deeper...
>
> I think the problem is that resolve_ref() walks a linked
> list of searching for the packed ref. Does this mean that
> packed refs are not indexed at all?
Are you sure that it is walking the linked list that is the problem?
I've created a test repo with ~100k refs/changes/... style refs, and
~40000 refs/heads/... style refs, and checkout can walk the list of
~140k refs seven times in 85ms user time including doing whatever other
processing is needed for checkout. The real time is only 114ms - but
then my test repo has no real data in.
If resolve_ref() walking the linked list of refs was the problem, then
I would expect my test repo to show the same problem. It doesn't, a pre
ref-packing checkout took minutes (~0.5s user time), whereas a
ref-packed checkout takes ~0.1s. So, I would suggest that the problem
lies elsewhere.
Have you tried running a checkout whilst profiling?
--
Julian
^ permalink raw reply
* What's cooking in git.git (Sep 2011, #08; Wed, 28)
From: Junio C Hamano @ 2011-09-29 1:18 UTC (permalink / raw)
To: git
Here are the topics that have been cooking. Commits prefixed with '-' are
only in 'pu' while commits prefixed with '+' are in 'next'.
Here are the repositories that have my integration branches:
With maint, master, next, pu, todo, html and man:
url = git://repo.or.cz/alt-git.git
url = https://code.google.com/p/git-core/
url = https://github.com/git/git
With only maint, master, html and man:
url = git://git.sourceforge.jp/gitroot/git-core/git.git
url = git://git-core.git.sourceforge.net/gitroot/git-core/git-core
With all the topics and integration branches but without todo, html or man:
url = https://github.com/gitster/git
Until kernel.org comes back to life, it may be a good idea to tentatively
have the following in your $HOME/.gitconfig:
[url "http://code.google.com/p/git-core"]
insteadOf = git://git.kernel.org/pub/scm/git/git.git
You can use git://repo.or.cz/ or https://github.com/git/git as your
replacement site as well.
--------------------------------------------------
[New Topics]
* jc/apply-blank-at-eof-fix (2011-09-26) 1 commit
- apply --whitespace=error: correctly report new blank lines at end
* nd/sparse-doc (2011-09-26) 1 commit
- git-read-tree.txt: update sparse checkout examples
* jp/get-ref-dir-unsorted (2011-09-26) 1 commit
- Don't sort ref_list too early
* jc/grep-untracked-exclude (2011-09-28) 2 commits
- Merge branch 'jc/maint-grep-untracked-exclude' into jc/grep-untracked-exclude
- Merge branch 'jc/maint-grep-untracked-exclude' into jc/grep-untracked-exclude
(this branch uses bw/grep-no-index-no-exclude and jc/maint-grep-untracked-exclude.)
* jc/maint-grep-untracked-exclude (2011-09-28) 3 commits
- grep: rename --exclude to --exclude-standard
- grep: --untracked and --exclude tests
- grep: teach --untracked and --exclude options
(this branch is used by jc/grep-untracked-exclude; uses bw/grep-no-index-no-exclude.)
* jc/parse-options-boolean (2011-09-28) 5 commits
- apply: use OPT_NOOP_NOARG
- revert: use OPT_NOOP_NOARG
- parseopt: add OPT_NOOP_NOARG
- archive.c: use OPT_BOOL()
- parse-options: deprecate OPT_BOOLEAN
* mh/maint-notes-merge-pathbuf-fix (2011-09-27) 1 commit
- notes_merge_commit(): do not pass temporary buffer to other function
* ph/push-to-delete-nothing (2011-09-28) 2 commits
- fixup
- receive-pack: don't pass non-existent refs to post-{receive,update} hooks in push deletions
* ps/gitweb-js-with-lineno (2011-09-27) 1 commit
- gitweb: Fix links to lines in blobs when javascript-actions are enabled
* zj/send-email-authen-sasl (2011-09-28) 1 commit
- send-email: auth plain/login fix
--------------------------------------------------
[Graduated to "master"]
* jc/namespace-doc-with-old-asciidoc (2011-09-16) 1 commit
+ Documentation/gitnamespaces.txt: cater to older asciidoc
It turns out that the version of AsciiDoc that has troubles with the
mark-up this patch works around is not quite old enough to be dismissed
as irrelevant.
--------------------------------------------------
[Stalled]
* hv/submodule-merge-search (2011-08-26) 5 commits
- submodule: Search for merges only at end of recursive merge
- allow multiple calls to submodule merge search for the same path
- submodule: Demonstrate known breakage during recursive merge
- push: Don't push a repository with unpushed submodules
(merged to 'next' on 2011-08-24 at 398e764)
+ push: teach --recurse-submodules the on-demand option
(this branch is tangled with fg/submodule-auto-push.)
The second from the bottom one needs to be replaced with a properly
written commit log message.
* jc/signed-push (2011-09-12) 7 commits
- push -s: support pre-receive-signature hook
- push -s: receiving end
- push -s: send signed push certificate
- push -s: skeleton
- Split GPG interface into its own helper library
- rename "match_refs()" to "match_push_refs()"
- send-pack: typofix error message
(this branch uses jc/run-receive-hook-cleanup; is tangled with jc/signed-push-3.)
This was the v2 that updated notes tree on the receiving end.
* jc/signed-push-3 (2011-09-12) 4 commits
. push -s: signed push
- Split GPG interface into its own helper library
- rename "match_refs()" to "match_push_refs()"
- send-pack: typofix error message
(this branch uses jc/run-receive-hook-cleanup; is tangled with jc/signed-push.)
This is the third edition, that moves the preparation of the notes tree to
the sending end.
I expect that both of these topics will be discarded.
* jk/add-i-hunk-filter (2011-07-27) 5 commits
(merged to 'next' on 2011-08-11 at 8ff9a56)
+ add--interactive: add option to autosplit hunks
+ add--interactive: allow negatation of hunk filters
+ add--interactive: allow hunk filtering on command line
+ add--interactive: factor out regex error handling
+ add--interactive: refactor patch mode argument processing
Will be dropped.
* jh/receive-count-limit (2011-05-23) 10 commits
- receive-pack: Allow server to refuse pushes with too many objects
- pack-objects: Estimate pack size; abort early if pack size limit is exceeded
- send-pack/receive-pack: Allow server to refuse pushing too large packs
- pack-objects: Allow --max-pack-size to be used together with --stdout
- send-pack/receive-pack: Allow server to refuse pushes with too many commits
- pack-objects: Teach new option --max-commit-count, limiting #commits in pack
- receive-pack: Prepare for addition of the new 'limit-*' family of capabilities
- Tighten rules for matching server capabilities in server_supports()
- send-pack: Attempt to retrieve remote status even if pack-objects fails
- Update technical docs to reflect side-band-64k capability in receive-pack
Would need another round to separate per-pack and per-session limits.
* jk/generation-numbers (2011-09-11) 8 commits
- metadata-cache.c: make two functions static
- limit "contains" traversals based on commit generation
- check commit generation cache validity against grafts
- pretty: support %G to show the generation number of a commit
- commit: add commit_generation function
- add metadata-cache infrastructure
- decorate: allow storing values instead of pointers
- Merge branch 'jk/tag-contains-ab' (early part) into HEAD
The initial "tag --contains" de-pessimization without need for generation
numbers is already in; backburnered.
* sr/transport-helper-fix-rfc (2011-07-19) 2 commits
- t5800: point out that deleting branches does not work
- t5800: document inability to push new branch with old content
Perhaps 281eee4 (revision: keep track of the end-user input from the
command line, 2011-08-25) in bk/ancestry-path would help.
* po/cygwin-backslash (2011-08-05) 2 commits
- On Cygwin support both UNIX and DOS style path-names
- git-compat-util: add generic find_last_dir_sep that respects is_dir_sep
Incomplete with respect to backslash processing in prefix_filename(), and
also loses the ability to escape glob specials. Perhaps drop?
--------------------------------------------------
[Cooking]
* jc/maint-diffstat-numstat-context (2011-09-22) 1 commit
(merged to 'next' on 2011-09-26 at 12539ab)
+ diff: teach --stat/--numstat to honor -U$num
"diff" is allowed to match the common lines differently depending on how
many context lines it is showing, so running --(num)stat with 0 lines of
context internally gives a result that may be surprising to some people.
* nd/maint-sparse-errors (2011-09-22) 2 commits
(merged to 'next' on 2011-09-26 at cdcdec5)
+ Add explanation why we do not allow to sparse checkout to empty working tree
+ sparse checkout: show error messages when worktree shaping fails
* rs/diff-cleanup-records-fix (2011-09-26) 1 commit
(merged to 'next' on 2011-09-27 at 3bd75d8)
+ Revert removal of multi-match discard heuristic in 27af01
* di/fast-import-empty-tag-note-fix (2011-09-22) 2 commits
- fast-import: don't allow to note on empty branch
- fast-import: don't allow to tag empty branch
Looked reasonable.
* js/check-attr-cached (2011-09-22) 2 commits
(merged to 'next' on 2011-09-27 at 74d7b66)
+ t0003: remove extra whitespaces
+ Teach '--cached' option to check-attr
* bw/grep-no-index-no-exclude (2011-09-15) 2 commits
(merged to 'next' on 2011-09-26 at 776f13b)
+ grep --no-index: don't use git standard exclusions
+ grep: do not use --index in the short usage output
(this branch is used by jc/grep-untracked-exclude and jc/maint-grep-untracked-exclude.)
* jc/want-commit (2011-09-15) 1 commit
(merged to 'next' on 2011-09-26 at 5841512)
+ Allow git merge ":/<pattern>"
* jc/ls-remote-short-help (2011-09-16) 1 commit
(merged to 'next' on 2011-09-26 at e24a27a)
+ ls-remote: a lone "-h" is asking for help
* jc/maint-bundle-too-quiet (2011-09-19) 1 commit
(merged to 'next' on 2011-09-26 at ba140d4)
+ Teach progress eye-candy to fetch_refs_from_bundle()
* jk/filter-branch-require-clean-work-tree (2011-09-15) 1 commit
(merged to 'next' on 2011-09-26 at 206a74a)
+ filter-branch: use require_clean_work_tree
* jn/gitweb-highlite-sanitise (2011-09-16) 1 commit
(merged to 'next' on 2011-09-26 at c79390a)
+ gitweb: Strip non-printable characters from syntax highlighter output
* mh/check-ref-format-3 (2011-09-16) 22 commits
- add_ref(): verify that the refname is formatted correctly
- resolve_ref(): expand documentation
- resolve_ref(): also treat a too-long SHA1 as invalid
- resolve_ref(): emit warnings for improperly-formatted references
- resolve_ref(): verify that the input refname has the right format
- remote: avoid passing NULL to read_ref()
- remote: use xstrdup() instead of strdup()
- resolve_ref(): do not follow incorrectly-formatted symbolic refs
- resolve_ref(): extract a function get_packed_ref()
- resolve_ref(): turn buffer into a proper string as soon as possible
- resolve_ref(): only follow a symlink that contains a valid, normalized refname
- resolve_ref(): use prefixcmp()
- resolve_ref(): explicitly fail if a symlink is not readable
- Change check_refname_format() to reject unnormalized refnames
- Inline function refname_format_print()
- Make collapse_slashes() allocate memory for its result
- Do not allow ".lock" at the end of any refname component
- Refactor check_refname_format()
- Change check_ref_format() to take a flags argument
- Change bad_ref_char() to return a boolean value
- git check-ref-format: add options --allow-onelevel and --refspec-pattern
- t1402: add some more tests
* cn/eradicate-working-copy (2011-09-21) 1 commit
(merged to 'next' on 2011-09-26 at 2683d36)
+ Remove 'working copy' from the documentation and C code
* js/bisect-no-checkout (2011-09-21) 1 commit
(merged to 'next' on 2011-09-21 at e94ad3e)
+ bisect: fix exiting when checkout failed in bisect_start()
* mg/maint-doc-sparse-checkout (2011-09-21) 3 commits
(merged to 'next' on 2011-09-21 at f316dec)
+ git-read-tree.txt: correct sparse-checkout and skip-worktree description
+ git-read-tree.txt: language and typography fixes
+ unpack-trees: print "Aborting" to stderr
* ms/patch-id-with-overlong-line (2011-09-22) 1 commit
(merged to 'next' on 2011-09-26 at a33d0b2)
+ patch-id.c: use strbuf instead of a fixed buffer
* sn/doc-update-index-assume-unchanged (2011-09-21) 1 commit
(merged to 'next' on 2011-09-21 at 325e796)
+ Documentation/git-update-index: refer to 'ls-files'
* jc/request-pull-show-head-4 (2011-09-21) 7 commits
- request-pull: use the branch description
- request-pull: state what commit to expect
- request-pull: modernize style
- branch: teach --edit-description option
- format-patch: use branch description in cover letter
- branch: add read_branch_desc() helper function
- Merge branch 'bk/ancestry-path' into jc/branch-desc
(this branch uses bk/ancestry-path.)
* jm/mergetool-pathspec (2011-09-26) 2 commits
(merged to 'next' on 2011-09-26 at f699566)
+ mergetool: no longer need to save standard input
+ mergetool: Use args as pathspec to unmerged files
* nd/maint-autofix-tag-in-head (2011-09-18) 4 commits
(merged to 'next' on 2011-09-27 at dc8e2e3)
+ Accept tags in HEAD or MERGE_HEAD
+ merge: remove global variable head[]
+ merge: use return value of resolve_ref() to determine if HEAD is invalid
+ merge: keep stash[] a local variable
* jk/maint-fetch-submodule-check-fix (2011-09-12) 1 commit
(merged to 'next' on 2011-09-12 at 3c73b8c)
+ fetch: avoid quadratic loop checking for updated submodules
(this branch is used by jk/argv-array.)
This probably can wait, as long as the other half of the regression fix
is in the upcoming release.
* bc/attr-ignore-case (2011-09-14) 5 commits
(merged to 'next' on 2011-09-26 at 1e0814c)
+ attr: read core.attributesfile from git_default_core_config
+ attr.c: respect core.ignorecase when matching attribute patterns
+ builtin/mv.c: plug miniscule memory leak
+ cleanup: use internal memory allocation wrapper functions everywhere
+ attr.c: avoid inappropriate access to strbuf "buf" member
* jc/maint-fsck-fwrite-size-check (2011-09-11) 1 commit
(merged to 'next' on 2011-09-16 at 2258f11)
+ fsck: do not abort upon finding an empty blob
* jk/argv-array (2011-09-14) 7 commits
(merged to 'next' on 2011-09-16 at 90feab4)
+ run_hook: use argv_array API
+ checkout: use argv_array API
+ bisect: use argv_array API
+ quote: provide sq_dequote_to_argv_array
+ refactor argv_array into generic code
+ quote.h: fix bogus comment
+ add sha1_array API docs
(this branch uses jk/maint-fetch-submodule-check-fix.)
* js/cred-macos-x-keychain-2 (2011-09-14) 1 commit
(merged to 'next' on 2011-09-26 at 4f289a4)
+ contrib: add a pair of credential helpers for Mac OS X's keychain
(this branch uses jk/http-auth-keyring.)
Welcome addition to build our confidence in the jk/http-auth-keyring topic.
* rj/maint-t9159-svn-rev-notation (2011-09-21) 1 commit
(merged to 'next' on 2011-09-26 at 525a567)
+ t9159-*.sh: skip for mergeinfo test for svn <= 1.4
* tr/doc-note-rewrite (2011-09-13) 1 commit
(merged to 'next' on 2011-09-16 at 5fe813a)
+ Documentation: basic configuration of notes.rewriteRef
Updated to a safer wording.
* jk/default-attr (2011-09-12) 1 commit
- attr: map builtin userdiff drivers to well-known extensions
Will be re-rolled after 1.7.7 final.
* hl/iso8601-more-zone-formats (2011-09-12) 1 commit
(merged to 'next' on 2011-09-12 at 270f5c7)
+ date.c: Support iso8601 timezone formats
* jc/run-receive-hook-cleanup (2011-09-12) 1 commit
(merged to 'next' on 2011-09-12 at 68dd431)
+ refactor run_receive_hook()
(this branch is used by jc/signed-push and jc/signed-push-3.)
Just to make it easier to run a hook that reads from its standard input.
* jk/for-each-ref (2011-09-08) 5 commits
(merged to 'next' on 2011-09-14 at 36ed515)
+ for-each-ref: add split message parts to %(contents:*).
+ for-each-ref: handle multiline subjects like --pretty
+ for-each-ref: refactor subject and body placeholder parsing
+ t6300: add more body-parsing tests
+ t7004: factor out gpg setup
* wh/normalize-alt-odb-path (2011-09-07) 1 commit
(merged to 'next' on 2011-09-14 at 96f722b)
+ sha1_file: normalize alt_odb path before comparing and storing
* fk/use-kwset-pickaxe-grep-f (2011-09-11) 2 commits
(merged to 'next' on 2011-09-14 at 436d858)
+ obstack.c: Fix some sparse warnings
+ sparse: Fix an "Using plain integer as NULL pointer" warning
In general we would prefer to see these fixed at the upstream first, but
we have essentially forked from them at their last GPLv2 versions...
* jc/make-static (2011-09-14) 4 commits
(merged to 'next' on 2011-09-14 at c5943ff)
+ exec_cmd.c: prepare_git_cmd() is sometimes used
+ environment.c: have_git_dir() has users on Cygwin
(merged to 'next' on 2011-09-11 at 2acb0af)
+ vcs-svn: remove unused functions and make some static
+ make-static: master
With a few fix-ups; probably needs to be ejected after 1.7.7 happens.
* rj/quietly-create-dep-dir (2011-09-11) 1 commit
(merged to 'next' on 2011-09-12 at 93d1c6b)
+ Makefile: Make dependency directory creation less noisy
* mz/remote-rename (2011-09-11) 4 commits
(merged to 'next' on 2011-09-26 at 5e64f68)
+ remote: only update remote-tracking branch if updating refspec
+ remote rename: warn when refspec was not updated
+ remote: "rename o foo" should not rename ref "origin/bar"
+ remote: write correct fetch spec when renaming remote 'remote'
* cb/common-prefix-unification (2011-09-12) 3 commits
(merged to 'next' on 2011-09-14 at 24f571f)
+ rename pathspec_prefix() to common_prefix() and move to dir.[ch]
+ consolidate pathspec_prefix and common_prefix
+ remove prefix argument from pathspec_prefix
* cb/send-email-help (2011-09-12) 1 commit
(merged to 'next' on 2011-09-14 at ae71999)
+ send-email: add option -h
A separate set of patches to remove the hidden fully-spelled "help" from
other commands would be nice to have as companion patches as well.
* jc/fetch-pack-fsck-objects (2011-09-04) 3 commits
(merged to 'next' on 2011-09-12 at a031347)
+ test: fetch/receive with fsckobjects
+ transfer.fsckobjects: unify fetch/receive.fsckobjects
+ fetch.fsckobjects: verify downloaded objects
We had an option to verify the sent objects before accepting a push but
lacked the corresponding option when fetching. In the light of the recent
k.org incident, a change like this would be a good addition.
* jc/fetch-verify (2011-09-01) 3 commits
(merged to 'next' on 2011-09-12 at 3f491ab)
+ fetch: verify we have everything we need before updating our ref
+ rev-list --verify-object
+ list-objects: pass callback data to show_objects()
(this branch uses jc/traverse-commit-list; is tangled with jc/receive-verify.)
During a fetch, we verify that the pack stream is self consistent,
but did not verify that the refs that are updated are consistent with
objects contained in the packstream, and this adds such a check.
* jc/receive-verify (2011-09-09) 6 commits
(merged to 'next' on 2011-09-12 at 856de78)
+ receive-pack: check connectivity before concluding "git push"
+ check_everything_connected(): libify
+ check_everything_connected(): refactor to use an iterator
+ fetch: verify we have everything we need before updating our ref
+ rev-list --verify-object
+ list-objects: pass callback data to show_objects()
(this branch uses jc/traverse-commit-list; is tangled with jc/fetch-verify.)
While accepting a push, we verify that the pack stream is self consistent,
but did not verify that the refs the push updates are consistent with
objects contained in the packstream, and this adds such a check.
* jn/maint-http-error-message (2011-09-06) 2 commits
(merged to 'next' on 2011-09-12 at a843f03)
+ http: avoid empty error messages for some curl errors
+ http: remove extra newline in error message
* bk/ancestry-path (2011-09-15) 4 commits
(merged to 'next' on 2011-09-15 at aa64d04)
+ t6019: avoid refname collision on case-insensitive systems
(merged to 'next' on 2011-09-02 at d05ba5d)
+ revision: do not include sibling history in --ancestry-path output
+ revision: keep track of the end-user input from the command line
+ rev-list: Demonstrate breakage with --ancestry-path --all
(this branch is used by jc/request-pull-show-head-4.)
* mg/branch-list (2011-09-13) 7 commits
(merged to 'next' on 2011-09-14 at 6610a2e)
+ t3200: clean up checks for file existence
(merged to 'next' on 2011-09-11 at 20a9cdb)
+ branch: -v does not automatically imply --list
(merged to 'next' on 2011-09-02 at b818eae)
+ branch: allow pattern arguments
+ branch: introduce --list option
+ git-branch: introduce missing long forms for the options
+ git-tag: introduce long forms for the options
+ t6040: test branch -vv
* mm/rebase-i-exec-edit (2011-08-26) 2 commits
(merged to 'next' on 2011-09-02 at e75b1b9)
+ rebase -i: notice and warn if "exec $cmd" modifies the index or the working tree
+ rebase -i: clean error message for --continue after failed exec
* mm/mediawiki-as-a-remote (2011-09-28) 6 commits
(merged to 'next' on 2011-09-28 at a1c9ae5)
+ git-remote-mediawiki: allow a domain to be set for authentication
(merged to 'next' on 2011-09-27 at 7ce8254)
+ git-remote-mediawiki: obey advice.pushNonFastForward
+ git-remote-mediawiki: set 'basetimestamp' to let the wiki handle conflicts
+ git-remote-mediawiki: trivial fixes
(merged to 'next' on 2011-09-12 at 163c6a5)
+ git-remote-mediawiki: allow push to set MediaWiki metadata
+ Add a remote helper to interact with mediawiki (fetch & push)
Fun.
* bc/unstash-clean-crufts (2011-08-27) 4 commits
(merged to 'next' on 2011-09-02 at 7bfd66f)
+ git-stash: remove untracked/ignored directories when stashed
+ t/t3905: add missing '&&' linkage
+ git-stash.sh: fix typo in error message
+ t/t3905: use the name 'actual' for test output, swap arguments to test_cmp
* da/make-auto-header-dependencies (2011-08-30) 1 commit
(merged to 'next' on 2011-09-02 at e04a4af)
+ Makefile: Improve compiler header dependency check
(this branch uses fk/make-auto-header-dependencies.)
* gb/am-hg-patch (2011-08-29) 1 commit
(merged to 'next' on 2011-09-02 at 3edfe4c)
+ am: preliminary support for hg patches
* jc/diff-index-unpack (2011-08-29) 3 commits
(merged to 'next' on 2011-09-02 at 4206bd9)
+ diff-index: pass pathspec down to unpack-trees machinery
+ unpack-trees: allow pruning with pathspec
+ traverse_trees(): allow pruning with pathspec
* nm/grep-object-sha1-lock (2011-08-30) 1 commit
(merged to 'next' on 2011-09-02 at 336f57d)
+ grep: Fix race condition in delta_base_cache
* tr/mergetool-valgrind (2011-08-30) 1 commit
(merged to 'next' on 2011-09-02 at f5f2c61)
+ Symlink mergetools scriptlets into valgrind wrappers
* fg/submodule-auto-push (2011-09-11) 2 commits
(merged to 'next' on 2011-09-11 at 3fc86f7)
+ submodule.c: make two functions static
(merged to 'next' on 2011-08-24 at 398e764)
+ push: teach --recurse-submodules the on-demand option
(this branch is tangled with hv/submodule-merge-search.)
What the topic aims to achieve may make sense, but the implementation
looked somewhat suboptimal.
* jc/traverse-commit-list (2011-08-22) 3 commits
(merged to 'next' on 2011-08-24 at df50dd7)
+ revision.c: update show_object_with_name() without using malloc()
+ revision.c: add show_object_with_name() helper function
+ rev-list: fix finish_object() call
(this branch is used by jc/fetch-verify and jc/receive-verify.)
* fk/make-auto-header-dependencies (2011-08-18) 1 commit
(merged to 'next' on 2011-08-24 at 3da2c25)
+ Makefile: Use computed header dependencies if the compiler supports it
(this branch is used by da/make-auto-header-dependencies.)
* mh/iterate-refs (2011-09-11) 7 commits
(merged to 'next' on 2011-09-27 at c289699)
+ refs.c: make create_cached_refs() static
+ Retain caches of submodule refs
+ Store the submodule name in struct cached_refs
+ Allocate cached_refs objects dynamically
+ Change the signature of read_packed_refs()
+ Access reference caches only through new function get_cached_refs()
+ Extract a function clear_cached_refs()
I did not see anything fundamentally wrong with this series, but it was
unclear what the benefit of these changes are. If the series were to read
parts of the ref hierarchy (like refs/heads/) lazily, the story would
have been different, though.
* hv/submodule-update-none (2011-08-11) 2 commits
(merged to 'next' on 2011-08-24 at 5302fc1)
+ add update 'none' flag to disable update of submodule by default
+ submodule: move update configuration variable further up
* jc/lookup-object-hash (2011-08-11) 6 commits
(merged to 'next' on 2011-08-24 at 5825411)
+ object hash: replace linear probing with 4-way cuckoo hashing
+ object hash: we know the table size is a power of two
+ object hash: next_size() helper for readability
+ pack-objects --count-only
+ object.c: remove duplicated code for object hashing
+ object.c: code movement for readability
I do not think there is anything fundamentally wrong with this series, but
the risk of breakage far outweighs observed performance gain in one
particular workload. Will keep it in 'next' at least for one cycle.
* fg/submodule-git-file-git-dir (2011-08-22) 2 commits
(merged to 'next' on 2011-08-23 at 762194e)
+ Move git-dir for submodules
+ rev-parse: add option --resolve-git-dir <path>
I do not think there is anything fundamentally wrong with this series, but
the risk of breakage outweighs any benefit for having this new
feature. Will keep it in 'next' at least for one cycle.
* jk/http-auth-keyring (2011-09-28) 22 commits
(merged to 'next' on 2011-09-28 at 65ce6c2)
+ credential-cache: don't cache items without context
(merged to 'next' on 2011-09-16 at b4195eb)
+ check_expirations: don't copy over same element
+ t0300: add missing EOF terminator for <<
(merged to 'next' on 2011-09-14 at 589c7c9)
+ credential-store: use a better storage format
+ t0300: make alternate username tests more robust
+ t0300: make askpass tests a little more robust
+ credential-cache: fix expiration calculation corner cases
+ docs: minor tweaks to credentials API
(merged to 'next' on 2011-09-11 at 491ce6a)
+ credentials: make credential_fill_gently() static
(merged to 'next' on 2011-08-03 at b06e80e)
+ credentials: add "getpass" helper
+ credentials: add "store" helper
+ credentials: add "cache" helper
+ docs: end-user documentation for the credential subsystem
+ http: use hostname in credential description
+ allow the user to configure credential helpers
+ look for credentials in config before prompting
+ http: use credential API to get passwords
+ introduce credentials API
+ http: retry authentication failures for all http requests
+ remote-curl: don't retry auth failures with dumb protocol
+ improve httpd auth tests
+ url: decode buffers that are not NUL-terminated
(this branch is tangled with js/cred-macos-x-keychain-2.)
* rr/revert-cherry-pick-continue (2011-09-11) 19 commits
(merged to 'next' on 2011-09-11 at 7d78054)
+ builtin/revert.c: make commit_list_append() static
(merged to 'next' on 2011-08-24 at 712c115)
+ revert: Propagate errors upwards from do_pick_commit
+ revert: Introduce --continue to continue the operation
+ revert: Don't implicitly stomp pending sequencer operation
+ revert: Remove sequencer state when no commits are pending
+ reset: Make reset remove the sequencer state
+ revert: Introduce --reset to remove sequencer state
+ revert: Make pick_commits functionally act on a commit list
+ revert: Save command-line options for continuing operation
+ revert: Save data for continuing after conflict resolution
+ revert: Don't create invalid replay_opts in parse_args
+ revert: Separate cmdline parsing from functional code
+ revert: Introduce struct to keep command-line options
+ revert: Eliminate global "commit" variable
+ revert: Rename no_replay to record_origin
+ revert: Don't check lone argument in get_encoding
+ revert: Simplify and inline add_message_to_msg
+ config: Introduce functions to write non-standard file
+ advice: Introduce error_resolve_conflict
--------------------------------------------------
[Discarded]
* js/cred-macos-x-keychain (2011-09-11) 15 commits
(merged to 'next' on 2011-09-12 at 8d17f94)
+ contrib: add a credential helper for Mac OS X's keychain
(merged to 'next' on 2011-09-11 at 491ce6a)
+ credentials: make credential_fill_gently() static
(merged to 'next' on 2011-08-03 at b06e80e)
+ credentials: add "getpass" helper
+ credentials: add "store" helper
+ credentials: add "cache" helper
+ docs: end-user documentation for the credential subsystem
+ http: use hostname in credential description
+ allow the user to configure credential helpers
+ look for credentials in config before prompting
+ http: use credential API to get passwords
+ introduce credentials API
+ http: retry authentication failures for all http requests
+ remote-curl: don't retry auth failures with dumb protocol
+ improve httpd auth tests
+ url: decode buffers that are not NUL-terminated
(this branch is tangled with jk/http-auth-keyring and js/cred-macos-x-keychain-2.)
Reverted out of 'next'.
* jc/reflog-walk-use-only-nsha1 (2011-09-13) 4 commits
. (baloon) teach reflog-walk to look at only new-sha1 field
+ environment.c: have_git_dir() has users on Cygwin
(merged to 'next' on 2011-09-11 at 2acb0af)
+ vcs-svn: remove unused functions and make some static
+ make-static: master
(this branch is tangled with jc/make-static.)
* hw/maint-abspath-cwd-limit (2011-09-21) 3 commits
(merged to 'next' on 2011-09-21 at 210cf9a)
+ Revert 622fea4 (abspath.c: increase array size of cwd variable)
(merged to 'next' on 2011-09-19 at 7d5e921)
+ abspath.c: increase array size of cwd variable to PATH_MAX
+ path.c: increase array size of cwd variable to PATH_MAX
Reverted out of 'next'.
* jc/request-pull-show-head (2011-09-13) 2 commits
(merged to 'next' on 2011-09-13 at c82fb3a)
+ Revert "State what commit to expect in request-pull"
(merged to 'next' on 2011-09-12 at c1c7b73)
+ State what commit to expect in request-pull
Reverted out of 'next'.
^ permalink raw reply
* Re: Lack of detached signatures
From: Joseph Parmelee @ 2011-09-29 1:29 UTC (permalink / raw)
To: Ted Ts'o
Cc: Jeff King, Junio C Hamano, Carlos Martín Nieto,
Olsen, Alan R, Michael Witten, git@vger.kernel.org
In-Reply-To: <20110928230958.GJ19250@thunk.org>
On Wed, 28 Sep 2011, Ted Ts'o wrote:
> On Wed, Sep 28, 2011 at 06:25:43PM -0400, Jeff King wrote:
>> [1] This is a minor nit, and probably not worth breaking away from the
>> way the rest of the world does it, but it is somewhat silly to sign the
>> compressed data. I couldn't care less about the exact bytes in the
>> compressed version; what I care about is the actual tar file. The
>> compression is just a transport.
>
> The worry I have is that many users don't check the GPG checksum files
> as it is. If they have to decompress the file, and then run gpg to
> check the checksum, they might never get around to doing it.
>
> That being said, I'm not sure I have a good solution. One is to ship
> the file without using detached signatures, and ship a foo.tar.gz.gpg
> file, and force them to use GPG to unwrap the file before it can be
> unpacked. But users would yell and scream if we did that...
>
> - Ted
>
Or you could just provide detached signatures for the compressed tarballs
like they have been doing for years at kernel.org (and many other sites).
If tarball.tar.bz2 has a detached signature tarball.tar.bz2.sig, just
download them both and:
gpg --verify tarball.tar.gz.sig
To argue that some people don't avail themselves of this feature is no
excuse for not providing it for those of us who consider it vital. The
break in at k.o is no excuse for dropping this very sensible policy which
has protected us for years. Just change the signing key and continue as
before.
^ permalink raw reply
* Re: Git is not scalable with too many refs/*
From: Martin Fick @ 2011-09-29 1:37 UTC (permalink / raw)
To: Julian Phillips; +Cc: Christian Couder, git, Christian Couder, Thomas Rast
In-Reply-To: <c76d7f65203c0fc2c6e4e14fe2f33274@quantumfyre.co.uk>
On Wednesday 28 September 2011 18:59:09 Martin Fick wrote:
> Julian Phillips <julian@quantumfyre.co.uk> wrote:
> > On Wed, 28 Sep 2011 16:10:48 -0600, Martin Fick wrote:
> >> So with that bug fixed, the thing taking the most time
> >> now for a git checkout with ~100K refs seems to be the
> >> orphan check as Thomas predicted. The strange part with
> >> this, is that the orphan check seems to take only about
> >> ~20s in the repo where the refs aren't packed. However,
> >> in the repo where they are packed, this check takes at
> >> least 5min! This seems a bit unusual, doesn't it? Is
> >> the filesystem that much better at indexing refs than
> >> git's pack mechanism? Seems unlikely, the unpacked refs
> >> take 312M in the FS, the packed ones only take about
> >> 4.3M. I suspect their is something else unexpected
> >> going on here in the packed ref case.
> >>
> >> Any thoughts? I will dig deeper...
> >
> > I think the problem is that resolve_ref() walks a linked
> > list of searching for the packed ref. Does this mean that
> > packed refs are not indexed at all?
> > Are you sure that it is walking the linked list that is the problem?
It sure seems like it.
> I've created a test repo with ~100k refs/changes/... style refs, and
> ~40000 refs/heads/... style refs, and checkout can walk the list of
> ~140k refs seven times in 85ms user time including doing whatever other
> processing is needed for checkout. The real time is only 114ms - but
> then my test repo has no real data in.
If I understand what you are saying, it sounds like you do not have a very good test case. The amount of time it takes for checkout depends on how long it takes to find a ref with the sha1 that you are on. If that sha1 is so early in the list of refs that it only took you 7 traversals to find it, then that is not a very good testcase. I think that you should probably try making an orphaned ref (checkout a detached head, commit to it), that is probably the worst testcase since it should then have to search all 140K refs to eventually give up.
Again, if I understand what you are saying, if it took 85ms for 7 traversals, then it takes approximately 10ms per traversal, that's only 100/s! If you have to traverse it 140K times, that should work out to 1400s ~ 23mins.
> If resolve_ref() walking the linked list of refs was the problem, then > I would expect my test repo to show the same problem. It doesn't, a pre
> ref-packing checkout took minutes (~0.5s user time), whereas a
> ref-packed checkout takes ~0.1s. So, I would suggest that the problem > lies elsewhere.
>
> Have you tried running a checkout whilst profiling?
No, to be honest, I am not familiar with any profilling tools.
-Martin
Employee of Qualcomm Innovation Center,Inc. which is a member of Code Aurora Forum
^ permalink raw reply
* Re: Lack of detached signatures
From: Jeff King @ 2011-09-29 1:41 UTC (permalink / raw)
To: Ted Ts'o
Cc: Junio C Hamano, Joseph Parmelee, Carlos Martín Nieto,
Olsen, Alan R, Michael Witten, git@vger.kernel.org
In-Reply-To: <20110928230958.GJ19250@thunk.org>
On Wed, Sep 28, 2011 at 07:09:58PM -0400, Ted Ts'o wrote:
> On Wed, Sep 28, 2011 at 06:25:43PM -0400, Jeff King wrote:
> > [1] This is a minor nit, and probably not worth breaking away from the
> > way the rest of the world does it, but it is somewhat silly to sign the
> > compressed data. I couldn't care less about the exact bytes in the
> > compressed version; what I care about is the actual tar file. The
> > compression is just a transport.
>
> The worry I have is that many users don't check the GPG checksum files
> as it is. If they have to decompress the file, and then run gpg to
> check the checksum, they might never get around to doing it.
It shouldn't really be any more cumbersome. But at the same time, it's
different than the way everyone else does it, so any minor convenience
we get is probably nullified by simply confusing anybody.
I wonder how many people actually check gpg checksums on downloaded
files. I don't usually. But I do expect something like a package manager
building from upstream source (e.g., freebsd-style ports, or distro
packagers pulling a new upstream) to bother to check it.
> That being said, I'm not sure I have a good solution. One is to ship
> the file without using detached signatures, and ship a foo.tar.gz.gpg
> file, and force them to use GPG to unwrap the file before it can be
> unpacked. But users would yell and scream if we did that...
And rightly so. I mentioned the cost of implementing the mechanism
before. If it's just "Junio runs gpg and throws the detached signature
up on the ftp site", it's not a big deal. But if it's "you can't
download and install git until you have gpg installed", that is raising
the bar quite a bit.
It should be the recipient's decision how much they want to trust the
data. We would just be helping them out by providing more information.
-Peff
^ permalink raw reply
* Re: Lack of detached signatures
From: Ted Ts'o @ 2011-09-29 1:59 UTC (permalink / raw)
To: Junio C Hamano
Cc: Jeff King, Joseph Parmelee, Carlos Martín Nieto,
Olsen, Alan R, Michael Witten, git@vger.kernel.org
In-Reply-To: <7vd3ekxkca.fsf@alter.siamese.dyndns.org>
On Wed, Sep 28, 2011 at 05:28:53PM -0700, Junio C Hamano wrote:
>
> I suspect that letting GPG do the compression and shipping foo.tar.gpg
> would work just fine as well,
Good point. If only "tar -xW foo.tar.gpg" automatically verified the
gpg signature, that would work really well indeed. :-)
> I understand that the automated GPG signature k.org used to use on the
> master machine was primarily to protect the copies that the mirrors serve
> from getting tampered after they leave the master machine. Do you happen
> to know what the new policy will be? Will the developers who distribute
> their snapshot tarballs from the site be GPG signing them themselves
> before uploading?
This is still being negotiated. Given that developers are starting to
sign their release tags (and of course Linus has been doing this
already), one of the things that I've proposed is that we support is
to have the developer do something like this:
git archive --format=tar -o e2fsprogs-1.41.12.tar v1.41.12
gzip -9n e2fsprogs-1.41.12.tar
gpg --sign --detach -a e2fsprogs-1.41.12.tar.gz
and then just uploading the tar.gz.gpg file, the URL for the git tree,
and the tag that the server should use do the extraction.
> That would improve the situation (I suspect that there
> were some people who misunderstood that these GPG signature were to
> protect against break-in at the master machine), but at the same time, it
> may create the chicken-and-egg bootstrapping problem if public keys of too
> many people need to be published securely.
We are in the process of bootstrapping a GPG web of trust. Linus has
generated a new GPG key which has been signed by Peter Anvin, Dirk,
and myself. We'll get a much richer set of cross signatures at the
Kernel Summit in Prague in a few months.
Also, there's a pretty good intersection between kernel developers and
the Debian web of trust; there's been some talk of using that as an
auxiliary bootstrap for isolated kernel developers in distant part of
the world.
- Ted
^ permalink raw reply
* Re: Git is not scalable with too many refs/*
From: Julian Phillips @ 2011-09-29 2:19 UTC (permalink / raw)
To: Martin Fick; +Cc: Christian Couder, git, Christian Couder, Thomas Rast
In-Reply-To: <960aacbf-8d4d-4b2a-8902-f6380ff9febd@email.android.com>
On Wed, 28 Sep 2011 19:37:18 -0600, Martin Fick wrote:
> On Wednesday 28 September 2011 18:59:09 Martin Fick wrote:
>> Julian Phillips <julian@quantumfyre.co.uk> wrote:
-- snip --
>> I've created a test repo with ~100k refs/changes/... style refs, and
>> ~40000 refs/heads/... style refs, and checkout can walk the list of
>> ~140k refs seven times in 85ms user time including doing whatever
>> other
>> processing is needed for checkout. The real time is only 114ms - but
>> then my test repo has no real data in.
>
> If I understand what you are saying, it sounds like you do not have a
> very good test case. The amount of time it takes for checkout depends
> on how long it takes to find a ref with the sha1 that you are on. If
> that sha1 is so early in the list of refs that it only took you 7
> traversals to find it, then that is not a very good testcase. I think
> that you should probably try making an orphaned ref (checkout a
> detached head, commit to it), that is probably the worst testcase
> since it should then have to search all 140K refs to eventually give
> up.
>
> Again, if I understand what you are saying, if it took 85ms for 7
> traversals, then it takes approximately 10ms per traversal, that's
> only 100/s! If you have to traverse it 140K times, that should work
> out to 1400s ~ 23mins.
Well, it's no more than 10ms per traversal - since the rest of the work
presumably takes some time too ...
However, I had forgotten to make the orphaned commit as you suggest -
and then _bang_ 7N^2, it tries seven different variants of each ref
(which is silly as they are all fully qualified), and with packed refs
it has to search for them each time, all to turn names into hashes that
we already know to start with.
So, yes - it is that list traversal.
Does the following help?
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 5e356a6..f0f4ca1 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -605,7 +605,7 @@ static int add_one_ref_to_rev_list_arg(const char
*refname,
int flags,
void *cb_data)
{
- add_one_rev_list_arg(cb_data, refname);
+ add_one_rev_list_arg(cb_data, strdup(sha1_to_hex(sha1)));
return 0;
}
--
Julian
^ permalink raw reply related
* What's next for "signed push"?
From: Junio C Hamano @ 2011-09-29 3:42 UTC (permalink / raw)
To: Git Mailing List; +Cc: Robin H. Johnson
In-Reply-To: <7v62kc1v7m.fsf@alter.siamese.dyndns.org>
Junio C Hamano <gitster@pobox.com> writes:
> "Robin H. Johnson" <robbat2@gentoo.org> writes:
>
>> from CVS to Git (we're very close now), we've decided that the signed
>> pushes will provide better security than our plan of previous plan of
>> using signed notes, so we'd like to see signed pushes succeed.
>
> Could you elaborate on your "previous plan" a bit? What is a signed note,
> how would it help validate the authenticity, how do developers interact
> using it and what do you perceive as weaknesses compared to the signed
> push that we discussed a few weeks ago?
I was hoping that I could get another food-for-thought from people who
thought about signed commits before sending this out, but here is my
current thinking (I retitled your "Signed push progress" as there is
nothing to "progress" on without an active discussion).
Originally, I very much wanted to like the approach of v3 that was meant
to be simpler by having the logic to record the signed push certificate
only on the sending end. At the mechanism level, v3 looked simpler to me,
but from the point of view of end users, I doubt that it is simpler than
the approach of v2, where the sender prepares a push record, signs and
sends it, and the receiver records it to its notes tree to publish so that
others can fetch and verify.
Here are some of the issues, from end users' point of view, that v3 would
have that v2 would not, off the top of my head:
- Unless you are pushing into a repository solely for your own push, you
have to first fetch the notes ref from where you are about to push to,
then hope that your push does not conflict with others. If your push is
rejected for non-fast-forward of the signed-push notes tree (but not
for your real branches), you would have to rewind the push certificate
the failed "git push" prepared (you could probably add a patch to the
v3 to do so automatically, but I haven't looked closely for all
possible failure cases), run "git fetch", and then run "git push -s"
again which would create a new push record for you to sign. Because the
"signed-push" namespace for notes is meant to cover all the branches,
this will not work on a busy site that uses CVS/SVN style "shared
central repository" workflow at all.
- If you are pushing into multiple places, you would somehow need to
configure your end to keep one signed-push notes tree per remote that
you intend to push to with signature, to avoid contaminating remote
repositories of records of your push into other remote repositories.
It could be worked around by even more code on the sending end, but the
need for configuring alone is already an additional mental burden.
- It also was hoped that pre-receive or pre-update hook on the receiving
end can be used to authenticate and authorize the push itself with the
approach by v3, but when the check happens, the signed-notes tree to be
used for verification is not connected to any ref in the refs/notes/
hierarchy yet (otherwise it won't be pre-* hook). The query interface
"git notes show" needs to be updated so that it takes not just a ref
via the GIT_NOTES_REF interface, which is defined to specify a ref
because some subcommands of "git notes" need to create a new commit and
update it, but a bare notes tree commit object name [*1*]. We may need
to update "git notes" (at least "show" subcommand) for the use of
receiving end; v3 is no longer a simpler "sender only" solution.
I've shown how both v2 and v3 models would look to the end users with
working code, thought about the pros and cons probably longer than anybody
else, and at this point, if I were to choose between the two approaches
[*2*], I am inclined to suggest that we go with the v2 model [*3*].
Either that, or we will see follow-up patches to work around the above
(and there may be others we may later discover) issues from people who
still think v3 is a better approach.
Whether we go with v2 or v3, for people who want to verify the commits
against signed push certificates stored in notes tree:
- We need a wrapper like "tag --verify".
- We also need a way to merge these signed pushed certificates [*4*]. I
think the default notes merge is to concatenate, which would result in
duplicates of the records that was present in the common ancestor (and
no, "union" merge is not a safe way to remove these duplicates).
But see footnote *2* below.
[Footnotes]
*1* I wouldn't be surprised if it already worked when you give the object
name of the notes-tree commit to GIT_NOTES_REF when running "git show",
but that is not really a documented interface and working by accident. The
environment variable was designed to take a name of the ref.
*2* I say "if I were to choose between the two" for a reason. It will make
things simpler if we drop "add signature separately to notes" altogether,
and instead adopt a "signed commit" approach. Embed GPG signature in a
commit object, and allow the receiving end of the "push" to be configured
to reject a push that tries to place a non-signed commit at the tip of a
ref. The same "tip of a ref must be a signed commit" check can be done for
"fetch".
The most attractive part of the "signed commit" approach is that it does
not force Linus to fetch push-signature notes trees from his lieutenants,
and merge them to his push-signature notes tree, which is an unnecessary
chore. The most likely thing to happen, especially under v3 design, would
be that higher level maintainers will not bother to fetch/verify/merge the
signed-push notes trees from their feeders, and the final publishing site
will only have the push certificates from the owner of the repository at
the top-level, without downstream contributors' signature. The v2 design
already relies on the final verifier to independently collect signed-push
notes from publishing repositories of Linus and all the key repositories
Linus pulled from before verification, which feels more cumbersome, but
the same needs to be done in v3 if the higher level maintainers do not buy
in the fetch/verify/merge overhead for the push-signature notes tree.
If signatures are embeded in the commits themselves, the issue of merging
push-signature notes tree disappears. Whenever the top level maintainer
pulls from his lieutenants, his fetch can (and should) check the signature
of these lieutenants, and their signatures stay in the history the top
level maintainer integrates and eventually pushes out with his own
signature.
As to the embedding of the signature in the commit, I am inclined to put
the lines of GPG detached signature in new header lines, after the
standard tree/parent/author/committer headers, instead of tucking it at
the end of the commit log message text, for multiple reasons:
- The signature won't clutter output from "git log" and friends if it is
in the extra header. If we place it at the end of the log message, we
would need to teach "git log" and friends to strip the signature block
with an option.
- Teaching new versions of "git log" and "gitk" to optionally verify and
show signatures is cleaner if we structurally know where the signature
block is (instead of scanning in the commit log message).
- The signature needs to be stripped upon various commit rewriting
operations, e.g. rebase, filter-branch, etc. They all already ignore
unknown headers, but if we place signature in the log message, all of
these tools (and third-party tools) also need to learn how a signature
block would look like.
- When we added the optional encoding header, all the tools (both in tree
and third-party) that acts on the raw commit object should have been
fixed to ignore headers they do not understand, so it is not like that
new header would be more likely to break than extra text in the commit.
*3* Honestly speaking, I myself was disturbed by the v2 model where the
signed-push is recorded primarily at the receiver. At the philosophical
level, the approach seems to go very much against the "distributed" nature
of Git. Sending what you want to be committed to the server and having the
server make a commit feels so very SVN/CVS. The usual "push" workflow for
Git users is to fetch first to come close to the other end, integrate your
work to prepare what you push out contains all of what the other end has,
and then pushing the result out, hoping that you are the latest and nobody
else had a chance to update the same thing.
But after thinking about it a bit more, I came to realize that the record
of push is quite unlike the branches you and others work on. First of all,
you are recording what happens at the receiving end, "I updated these refs
to these values in this repository". Making the record at the receiving
end is more natural than writing "I plan to update these refs", sending it
over to a dumb receiver and hoping it will fast-forward.
Don't get me wrong. The offline distributed workflow is a great enabler in
a distributed system like Git. The work you do on your branches can be
fully asynchronous to outside world, and being able to have local history
that can later be merged in order to avoid losing work by others while
still allowing us to be asynchronous is a great thing to have, but the act
of pushing the end result and recording the fact you pushed them into a
repository is inherently a synchronous event---you and the receiving
repository have to be connected when your "push" happens. I do not see a
need to be dogmatic and insist that everything we do is asynchronous.
*4* For the purpose of pushing things out, even with the v3 design, a
pusher does not have to worry about merging the signed-push notes tree
(there is no merge issue for pushers with the v2 model), I think.
"git push -s" will add a push record to the notes tree, and if the result
does not fast forward, "git fetch $remote +refs/notes/signed-push" (or use
per-remote signed-push hierarchy "+refs/notes/$remote/signed-push") that
discards the single failed push record would be all that is needed before
attempting "git push -s" again without losing any information.
^ permalink raw reply
* Re: Lack of detached signatures
From: Junio C Hamano @ 2011-09-29 3:50 UTC (permalink / raw)
To: Ted Ts'o
Cc: Jeff King, Joseph Parmelee, Carlos Martín Nieto,
Olsen, Alan R, Michael Witten, git@vger.kernel.org
In-Reply-To: <20110929015919.GL19250@thunk.org>
Ted Ts'o <tytso@mit.edu> writes:
>> That would improve the situation (I suspect that there
>> were some people who misunderstood that these GPG signature were to
>> protect against break-in at the master machine), but at the same time, it
>> may create the chicken-and-egg bootstrapping problem if public keys of too
>> many people need to be published securely.
>
> We are in the process of bootstrapping a GPG web of trust. Linus has
> generated a new GPG key which has been signed by Peter Anvin, Dirk,
> and myself. We'll get a much richer set of cross signatures at the
> Kernel Summit in Prague in a few months.
I was actually more worried about helping consumers convince themselves
that thusly signed keys indeed belong to producers like Linus, Peter,
etc. There are those who worry that DNS record to code.google.com/ for
them may point at an evil place to give them rogue download material.
"Here are the keys you can verify our trees with" message on the mailing
list, even with the message is signed with GPG, would not be satisfactory
to them.
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox