Git development
 help / color / mirror / Atom feed
* Re: [PATCH 2/2] builtin/apply.c: report error on failure to recognize input
From: Junio C Hamano @ 2011-12-05 19:27 UTC (permalink / raw)
  To: Brandon Casey; +Cc: git, artem.bityutskiy
In-Reply-To: <1322944550-27344-2-git-send-email-drafnel@gmail.com>

Brandon Casey <drafnel@gmail.com> writes:

> When git apply is passed something that is not a patch, it does not produce
> an error message or exit with a non-zero status if it was not actually
> "applying" the patch i.e. --check or --numstat etc were supplied on the
> command line.
>
> Fix this by producing an error when apply fails to find any hunks whatsoever
> while parsing the patch.
>
> This will cause some of the output formats (--numstat, --diffstat, etc) to
> produce an error when they formerly would have reported zero changes and
> exited successfully.  That seems like the correct behavior though.  Failure
> to recognize the input as a patch should be an error.
>
> Plus, add a test.
>
> Reported-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
> Signed-off-by: Brandon Casey <drafnel@gmail.com>
> ---
>
> Initially, I was reluctant to change the error message, thinking that
> error messages for plumbing commands were not supposed to change.  But I
> think I was wrong in that thought, so I changed the error message so it
> was a more descriptive "unrecognized input".

I am still reluctant to see

    $ git apply </dev/null
    error: unrecognized input

instead of "error: No changes", though.

"git apply --check" is about asking "do you see anything offending in the
diff?" and it is not "git apply --dry-run" that asks "do you promise if I
feed this for real to you you will apply it without complaint?".

I am slightly in favor of answering "well you do not have a diff to begin
with, which in itself is suspicious" to "do you see anything offending?"
question, but I have to admit that it is an equally valid answer to say
"no, there is nothing offending in the diff.", which is what we do with
the current code.

So, I dunno.

^ permalink raw reply

* hooks that do not consume stdin sometimes crash git with SIGPIPE
From: Joey Hess @ 2011-12-05 19:29 UTC (permalink / raw)
  To: git; +Cc: Lars Wirzenius
In-Reply-To: <20110829203107.GA4946@gnu.kitenet.net>

[-- Attachment #1: Type: text/plain, Size: 1638 bytes --]

We had a weird problem where, after moving to a new, faster server,
"git push" would sometimes fail like this:

Unpacking objects: 100% (3/3), done.
fatal: The remote end hung up unexpectedly
fatal: The remote end hung up unexpectedly

Turns out that git-receive-pack was dying due to an uncaught SIGPIPE.
The SIGPIPE occurred when it tried to write to the pre-receive hook's
stdin. The pre-receive hook, in this case, was able to do all the checks
it needed to do[1] without the input, and so did exit(0) without
consuming it.

Apparently that causes a race. Most of the time, git forks the hook,
writes output to the hook, and then the hook runs, ignores it, and exits.
But sometimes, on our new faster (and SMP) server, git forked the hook, and
it ran, and exited, before git got around to writing to it, resulting in
the SIGPIPE.

write(7, "c9f98c67d70a1cfeba382ec27d87644a"..., 100) = -1 EPIPE (Broken pipe)
--- SIGPIPE (Broken pipe) @ 0 (0) ---

I think git should ignore SIGPIPE when writing to hooks. Otherwise,
hooks may have to go out of their way to consume all input, and as I've
seen, the races when they fail to do this can lurk undiscovered.

Note that I encountered this same sort of problem from another direction
(involving smudge filters) not long ago, and sent a patch, in
<20110829203107.GA4946@gnu.kitenet.net>. That wasn't applied, and is
in different code than the case I outlined above.

-- 
see shy jo

[1] If you're wondering, it only needed to check that the push was
    coming from a trusted UID. With an untrusted UID, it did further
    checks that consumed the stdin.

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply

* Roadmap for 1.7.9
From: Junio C Hamano @ 2011-12-05 20:07 UTC (permalink / raw)
  To: git

I do not do roadmaps, so the title of this message is not quite correct,
but I think I should post my current thinking on what have been cooking so
far that deserve to be polished further to be in the next release.

I've been aiming for a release cycle that lasts for 8 to 10 weeks, and the
last 2 or 3 weeks of a cycle is meant for release candidate testing that
we fix only regressions, which means we need to see how much we can do
within 5 to 8 weeks. The current cycle is expected to end at the end of
January next month [*1*].

I am not so worried about small and obvious fixes and enhancements, and
changes in the periphery areas. They can and will be merged as they come
and cook long enough in 'next' without issues reported by their users. I
expect what have been cooking in 'next' during the feature-freeze before
the 1.7.8 release, other than the ones listed below, to be more or less
ready already and they should be in 'master' shortly.

Now, here are the biggies that we would want to try to have in reasonable
shape before the next release. The list may be a bit too ambitious, given
that this cycle overlaps with end-of-this-year/beginning-of-new-year
holiday season in various cultures.

 * Credentials and keychain (Peff)
 * Pulling signed tag (me, Linus)
 * Update "request-pull" script with information that matters (me)
 * Revisiting threading of grep (Rene, Thomas Rast)
 * Optimization of reading hierarchical refspace lazily (Michael Haggerty)

I expect the following will not make much progress without further
discussion:

 * Signed commit (me)
 * Ignored vs Precious (Nguyen Thai Ngoc Duy)
 * Sequencer (Ram, Jonathan Nieder)

I think the following are too big to be ready by the end of this cycle
(polishing could and will continue as time permits).

 * Large-files
  - bulk check-in (me)
  - "Chunked" encoding of large files (me)


[Reference]

*1* https://www.google.com/calendar/embed?src=jfgbl2mrlipp4pb6ieih0qr3so%40group.calendar.google.com&ctz=America/Los_Angeles

^ permalink raw reply

* Re: [PATCH v2 0/3] grep multithreading and scaling
From: Thomas Rast @ 2011-12-05 20:16 UTC (permalink / raw)
  To: Jeff King; +Cc: René Scharfe, Eric Herman, git, Junio C Hamano
In-Reply-To: <201112051038.16423.trast@student.ethz.ch>

Thomas Rast wrote:
> 
> I just found out that on Linux, there's mincore() that can tell us
> (racily, but who cares) whether a given file mapping is in memory.
[...]
> So that looks fairly promising, and the order would then be:
> 
> - if stat-clean, and we have mincore(), and it tells us we can do it
>   cheaply: grab file from tree
> 
> - if it's a loose object: decompress it
> 
> - if stat-clean: grab file from tree
> 
> - access packs as usual

Just a small note, I tried two things:

* the simpler option of grabbing a loose object if it exists and is
  mincore() turns out to massively slow down 'git log HEAD', probably
  because only very few of these objects are loose in the first place

* doing this only under grep's use_threads, and dropping the lock
  around unpack_sha1_file() [i.e., zlib decompression] still results
  in a git-grep that is slower than without this, though not much

So no improvement here.  Will have to look into the worktree trick
though.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply

* Re: Roadmap for 1.7.9
From: Ævar Arnfjörð Bjarmason @ 2011-12-05 21:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List
In-Reply-To: <7vd3c2lr36.fsf@alter.siamese.dyndns.org>

On Mon, Dec 5, 2011 at 21:07, Junio C Hamano <gitster@pobox.com> wrote:

> Now, here are the biggies that we would want to try to have in reasonable
> shape before the next release.

I'd like to get the i18n series into 1.7.9 as well. I think it's ready
as-is but some minor issues are sure to arise.

If time permits I'd also like to have a series of po/*.po files in as
well once it's in "master". Maybe as a submodule, which would be neat
in itself as we'd start dogfooding submodules.

^ permalink raw reply

* Re: hooks that do not consume stdin sometimes crash git with SIGPIPE
From: Jeff King @ 2011-12-05 21:43 UTC (permalink / raw)
  To: Joey Hess; +Cc: git, Lars Wirzenius
In-Reply-To: <20111205192930.GA32463@gnu.kitenet.net>

On Mon, Dec 05, 2011 at 03:29:30PM -0400, Joey Hess wrote:

> We had a weird problem where, after moving to a new, faster server,
> "git push" would sometimes fail like this:
> 
> Unpacking objects: 100% (3/3), done.
> fatal: The remote end hung up unexpectedly
> fatal: The remote end hung up unexpectedly
> 
> Turns out that git-receive-pack was dying due to an uncaught SIGPIPE.
> The SIGPIPE occurred when it tried to write to the pre-receive hook's
> stdin. The pre-receive hook, in this case, was able to do all the checks
> it needed to do[1] without the input, and so did exit(0) without
> consuming it.

Ugh. Yeah, receive-pack should probably just be ignoring SIGPIPE
entirely. It checks the return from write() properly where it matters,
so SIGPIPE is just an annoyance.

I would go so far as to say that git should be ignoring SIGPIPE 99% of
the time. We have crap like this sprinkled throughout the code base:

  $ git grep -C1 SIGPIPE
  builtin/tag.c-  /* When the username signingkey is bad, program could be terminated
  builtin/tag.c:   * because gpg exits without reading and then write gets SIGPIPE. */
  builtin/tag.c:  signal(SIGPIPE, SIG_IGN);
  [...]
  upload-pack.c-   * If rev-list --stdin encounters an unknown commit, it
  upload-pack.c:   * terminates, which will cause SIGPIPE in the write loop
  upload-pack.c-   */
  upload-pack.c:  sigchain_push(SIGPIPE, SIG_IGN);

but I find it highly unlikely that they are covering all of the cases.
You found one already, and these things can quite often be
race-condition heisenbugs.

The one place where SIGPIPE _is_ useful is for things like "git log"
which are just dumping to a pager over stdout. When the pager dies, we
can stop bothering to produce output.

It would be really nice if we could write a sigpipe handler that knew
which fd caused the the signal, and then we could do something like:

  void sigpipe_handler(int sig)
  {
          /* If we're writing to a pager over stdout, then there is
           * little use in writing more; nobody is interested in our
           * output. */
          if (get_fd_that_caused_sigpipe() == 1 && pager_in_use)
                  exit(1);

          /* Otherwise, ignore it, as it's a write to some auxiliary
           * process and we will be careful about checking the return
           * code from write(). */
  }

But I don't think such a function exists. We could just check
pager_in_use, which would cover most cases (e.g., programs like
receive-pack don't use a pager). But it would fail in the case of
something like "git log" using an auxiliary process that closes the pipe
early. Maybe that would be good enough, though. I dunno.

Another option is to just ignore SIGPIPE entirely, and convince programs
like log to actually bother checking the result of write(). It would be
a slight pain to check every printf() call we make in log-tree.c, but we
could do one of:

  1. Make a set of stdio wrapper scripts that exit gracefully on EPIPE.
     Using the wrapper scripts everywhere is a slight pain, but would
     work pretty well, and adapts easily to other places like printing
     lists of refs, etc.

  2. Don't check every printf. After printing each commit, check
     ferror(stdout) and exit as appropriate. This is a very small amount
     of code, but you'd need to do it in several places (i.e., anywhere
     that produces a lot of output).

> I think git should ignore SIGPIPE when writing to hooks. Otherwise,
> hooks may have to go out of their way to consume all input, and as I've
> seen, the races when they fail to do this can lurk undiscovered.

Yeah, certainly. The question to me is whether we should just stick a
SIG_IGN in the beginning of receive-pack, or whether we should try to
deal with this problem everywhere.

For example, I suspect the same problem exists in the credential helpers
I wrote recently. Generally they will read all of their input, but you
could do something like:

  [credential "https://github.com"]
          username = peff
          helper = "!
                  f() {
                          # if we call this helper, we know we want the
                          # github password,  or else this helper config
                          # would never have been triggered.  so we
                          # don't even have to bother reading our stdin.

                          # We only handle getting the password.
                          test "$1" = "get" || return

                          # Presumably gpg-agent will ask for and cache
                          # your gpg password.
                          p=`gpg -qd --no-tty <~/.github-password.gpg`

                          echo "password=$p"
                  }; f"

This can racily fail if our write happens after the helper has already
finished (unlikely, but possible).

-Peff

^ permalink raw reply

* Re: [PATCH, v4] git-tag: introduce --cleanup option
From: Jeff King @ 2011-12-05 21:51 UTC (permalink / raw)
  To: Kirill A. Shutemov; +Cc: git, Junio C Hamano
In-Reply-To: <1322972426-7591-1-git-send-email-kirill@shutemov.name>

On Sun, Dec 04, 2011 at 06:20:26AM +0200, Kirill A. Shutemov wrote:

> From: "Kirill A. Shutemov" <kirill@shutemov.name>
> 
> Normally git tag stripes tag message lines starting with '#', trailing
> spaces from every line and empty lines from the beginning and end.

s/stripes/strips

> --cleanup allows to select different cleanup modes for tag message.
> It provides the same interface as --cleanup option in git-commit.

Thanks, I think this is better, though it would be better still if they
could share the code. As a minor nit, I think the patch would be a
little easier to read and review if you split the actual changes from
the refactoring to use the "struct create_tag_options".

More importantly, though, this seems to break t6300 badly. I haven't
looked into why yet, though.

-Peff

^ permalink raw reply

* Re: [PATCH, v4] git-tag: introduce --cleanup option
From: Jeff King @ 2011-12-05 22:27 UTC (permalink / raw)
  To: Kirill A. Shutemov; +Cc: git, Junio C Hamano
In-Reply-To: <1322972426-7591-1-git-send-email-kirill@shutemov.name>

On Sun, Dec 04, 2011 at 06:20:26AM +0200, Kirill A. Shutemov wrote:

> @@ -367,14 +390,15 @@ static void create_tag(const unsigned char *object, const char *tag,
> [...]
> -	if (!message && !buf->len)
> +	if (opt->message && !buf->len)
>  		die(_("no tag message?"));

Ah, this is the hunk that causes t6300 to fail. You accidentally removed
the negation when converting the "message" flag over.

This was much easier to find by splitting the refactoring (where the bug
is) away from the new feature (which is where I assumed the bug was).
Here's the first half of the "split". You can rebase your original patch
on top to get the second half.

I also looked at factoring out the "which cleanup mode to select" logic
from builtin/commit.c, but it turned out to just make things harder to
follow.

-- >8 --
From: "Kirill A. Shutemov" <kirill@shutemov.name>
Subject: [PATCH] tag: refactor passing tag creation options

Rather than continually adding parameters to the create_tag
function, we can put all of the flags in a struct.

Signed-off-by: Jeff King <peff@peff.net>
---
Actually, I'm not sure this really buys us much as a refactoring. It
saves a parameter in the function, but it's not like we end up passing
all those parameters to sub-functions, where something like this would
increase readability. I'm fine with this, but I'd also be fine with just
dropping this half and passing the cleanup_mode parameter directly.

 builtin/tag.c |   32 ++++++++++++++++++++------------
 1 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/builtin/tag.c b/builtin/tag.c
index 9b6fd95..e5bd708 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -319,8 +319,13 @@ static int build_tag_object(struct strbuf *buf, int sign, unsigned char *result)
 	return 0;
 }
 
+struct create_tag_options {
+	unsigned int message;
+	unsigned int sign;
+};
+
 static void create_tag(const unsigned char *object, const char *tag,
-		       struct strbuf *buf, int message, int sign,
+		       struct strbuf *buf, struct create_tag_options *opt,
 		       unsigned char *prev, unsigned char *result)
 {
 	enum object_type type;
@@ -345,7 +350,7 @@ static void create_tag(const unsigned char *object, const char *tag,
 	if (header_len > sizeof(header_buf) - 1)
 		die(_("tag header too big."));
 
-	if (!message) {
+	if (!opt->message) {
 		int fd;
 
 		/* write the template message before editing: */
@@ -369,12 +374,12 @@ static void create_tag(const unsigned char *object, const char *tag,
 
 	stripspace(buf, 1);
 
-	if (!message && !buf->len)
+	if (!opt->message && !buf->len)
 		die(_("no tag message?"));
 
 	strbuf_insert(buf, 0, header_buf, header_len);
 
-	if (build_tag_object(buf, sign, result) < 0) {
+	if (build_tag_object(buf, opt->sign, result) < 0) {
 		if (path)
 			fprintf(stderr, _("The tag message has been left in %s\n"),
 				path);
@@ -422,9 +427,9 @@ int cmd_tag(int argc, const char **argv, const char *prefix)
 	unsigned char object[20], prev[20];
 	const char *object_ref, *tag;
 	struct ref_lock *lock;
-
-	int annotate = 0, sign = 0, force = 0, lines = -1,
-		list = 0, delete = 0, verify = 0;
+	struct create_tag_options opt;
+	int annotate = 0, force = 0, lines = -1, list = 0,
+	    delete = 0, verify = 0;
 	const char *msgfile = NULL, *keyid = NULL;
 	struct msg_arg msg = { 0, STRBUF_INIT };
 	struct commit_list *with_commit = NULL;
@@ -442,7 +447,7 @@ int cmd_tag(int argc, const char **argv, const char *prefix)
 		OPT_CALLBACK('m', "message", &msg, "message",
 			     "tag message", parse_msg_arg),
 		OPT_FILENAME('F', "file", &msgfile, "read message from file"),
-		OPT_BOOLEAN('s', "sign", &sign, "annotated and GPG-signed tag"),
+		OPT_BOOLEAN('s', "sign", &opt.sign, "annotated and GPG-signed tag"),
 		OPT_STRING('u', "local-user", &keyid, "key-id",
 					"use another key to sign the tag"),
 		OPT__FORCE(&force, "replace the tag if exists"),
@@ -459,13 +464,15 @@ int cmd_tag(int argc, const char **argv, const char *prefix)
 
 	git_config(git_tag_config, NULL);
 
+	opt.sign = 0;
+
 	argc = parse_options(argc, argv, prefix, options, git_tag_usage, 0);
 
 	if (keyid) {
-		sign = 1;
+		opt.sign = 1;
 		set_signingkey(keyid);
 	}
-	if (sign)
+	if (opt.sign)
 		annotate = 1;
 	if (argc == 0 && !(delete || verify))
 		list = 1;
@@ -523,9 +530,10 @@ int cmd_tag(int argc, const char **argv, const char *prefix)
 	else if (!force)
 		die(_("tag '%s' already exists"), tag);
 
+	opt.message = msg.given || msgfile;
+
 	if (annotate)
-		create_tag(object, tag, &buf, msg.given || msgfile,
-			   sign, prev, object);
+		create_tag(object, tag, &buf, &opt, prev, object);
 
 	lock = lock_any_ref_for_update(ref.buf, prev, 0);
 	if (!lock)
-- 
1.7.8.rc4.4.g884ec

^ permalink raw reply related

* Re: [PATCH, v4] git-tag: introduce --cleanup option
From: Jeff King @ 2011-12-05 22:29 UTC (permalink / raw)
  To: Kirill A. Shutemov; +Cc: git, Junio C Hamano
In-Reply-To: <20111205222724.GA7603@sigill.intra.peff.net>

On Mon, Dec 05, 2011 at 05:27:24PM -0500, Jeff King wrote:

> I also looked at factoring out the "which cleanup mode to select" logic
> from builtin/commit.c, but it turned out to just make things harder to
> follow.

While I was doing that, I also noticed this minor fix:

-- >8 --
Subject: [PATCH] stripspace: fix outdated comment

The comment on top of stripspace() claims that the buffer
will no longer be NUL-terminated. However, this has not been
the case at least since the move to using strbuf in 2007.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/stripspace.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/builtin/stripspace.c b/builtin/stripspace.c
index 4d3b93f..1288ffc 100644
--- a/builtin/stripspace.c
+++ b/builtin/stripspace.c
@@ -22,8 +22,6 @@ static size_t cleanup(char *line, size_t len)
  * Remove empty lines from the beginning and end
  * and also trailing spaces from every line.
  *
- * Note that the buffer will not be NUL-terminated.
- *
  * Turn multiple consecutive empty lines between paragraphs
  * into just one empty line.
  *
-- 
1.7.8.rc4.4.g884ec

^ permalink raw reply related

* Re: [PATCH, v4] git-tag: introduce --cleanup option
From: Junio C Hamano @ 2011-12-05 22:30 UTC (permalink / raw)
  To: Jeff King; +Cc: Kirill A. Shutemov, git
In-Reply-To: <20111205215148.GA22663@sigill.intra.peff.net>

Jeff King <peff@peff.net> writes:

> More importantly, though, this seems to break t6300 badly. I haven't
> looked into why yet, though.

Also breaks 7004 which is _about_ tags.

Rolling a broken patch in quick succession to v4 without ever running
tests (and not adding new test pieces to protect new feature) is not a
very productive way to use the reviewer bandwidth on this list.

^ permalink raw reply

* Re: [PATCH 2/2] builtin/apply.c: report error on failure to recognize input
From: Brandon Casey @ 2011-12-05 22:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, artem.bityutskiy
In-Reply-To: <7vzkf6lsyx.fsf@alter.siamese.dyndns.org>

On Mon, Dec 5, 2011 at 1:27 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Brandon Casey <drafnel@gmail.com> writes:
>
>> When git apply is passed something that is not a patch, it does not produce
>> an error message or exit with a non-zero status if it was not actually
>> "applying" the patch i.e. --check or --numstat etc were supplied on the
>> command line.
>>
>> Fix this by producing an error when apply fails to find any hunks whatsoever
>> while parsing the patch.
>>
>> This will cause some of the output formats (--numstat, --diffstat, etc) to
>> produce an error when they formerly would have reported zero changes and
>> exited successfully.  That seems like the correct behavior though.  Failure
>> to recognize the input as a patch should be an error.
>>
>> Plus, add a test.
>>
>> Reported-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
>> Signed-off-by: Brandon Casey <drafnel@gmail.com>
>> ---
>>
>> Initially, I was reluctant to change the error message, thinking that
>> error messages for plumbing commands were not supposed to change.  But I
>> think I was wrong in that thought, so I changed the error message so it
>> was a more descriptive "unrecognized input".
>
> I am still reluctant to see
>
>    $ git apply </dev/null
>    error: unrecognized input
>
> instead of "error: No changes", though.

I'm not partial to "unrecognized input", but I thought it was more
descriptive of what happened than "No changes".  This error message is
only printed out when absolutely no hunks were found while parsing the
input.

> "git apply --check" is about asking "do you see anything offending in the
> diff?" and it is not "git apply --dry-run" that asks "do you promise if I
> feed this for real to you you will apply it without complaint?".
>
> I am slightly in favor of answering "well you do not have a diff to begin
> with, which in itself is suspicious" to "do you see anything offending?"
> question, but I have to admit that it is an equally valid answer to say
> "no, there is nothing offending in the diff.", which is what we do with
> the current code.
>
> So, I dunno.

I think the current code is a little inconsistent with respect to
empty or bogus non-diff input.

It seems more consistent that if it is an error to tell git apply to
apply zero hunks, then it is also an error to --check zero hunks, or
--stat etc.  In all cases the cause is the same: failure to find any
hunks in the input because the input was not a diff.

Also, the man page description of --check says that it checks "if the
patch is applicable to the current working tree and/or the index".
The new behavior would answer that with "no, this patch is not
applicable ... since no hunks were found", rather than "yes, because
no hunks were found".  But I'm really arguing on the side of
"unrecognized input should be an error", since the new behavior would
also be an error for --stat, --numstat, etc.

-Brandon

^ permalink raw reply

* Re: [PATCH, v4] git-tag: introduce --cleanup option
From: Junio C Hamano @ 2011-12-05 22:41 UTC (permalink / raw)
  To: Jeff King; +Cc: Kirill A. Shutemov, git
In-Reply-To: <20111205215148.GA22663@sigill.intra.peff.net>

Jeff King <peff@peff.net> writes:

> More importantly, though, this seems to break t6300 badly. I haven't
> looked into why yet, though.

Probably two issues.

 - opt.message (and the original 'message') was misnamed and confused the
   patch author what "if (!message && !buf->len)" meant.

 - "opt" is a structure meant to be extensible, but is not initialized as
   a whole, inviting future errors.

It still seems to be broken with respect to the primary thing the patch
wanted to do (t7400 "git tag -F commentsfile comments-annotated-tag" does
not seem to produce an expected result), so I'll kick it back to the
Kirill to look at.

Thanks.

 builtin/tag.c |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/builtin/tag.c b/builtin/tag.c
index 27a66a3..7883720 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -329,7 +329,7 @@ static int build_tag_object(struct strbuf *buf, int sign, unsigned char *result)
 }
 
 struct create_tag_options {
-	unsigned int message;
+	unsigned int message_given:1;
 	unsigned int sign;
 	enum {
 		CLEANUP_NONE,
@@ -364,7 +364,7 @@ static void create_tag(const unsigned char *object, const char *tag,
 	if (header_len > sizeof(header_buf) - 1)
 		die(_("tag header too big."));
 
-	if (!opt->message) {
+	if (!opt->message_given) {
 		int fd;
 
 		/* write the template message before editing: */
@@ -393,7 +393,7 @@ static void create_tag(const unsigned char *object, const char *tag,
 	if (opt->cleanup_mode != CLEANUP_NONE)
 		stripspace(buf, opt->cleanup_mode == CLEANUP_ALL);
 
-	if (opt->message && !buf->len)
+	if (!opt->message_given && !buf->len)
 		die(_("no tag message?"));
 
 	strbuf_insert(buf, 0, header_buf, header_len);
@@ -486,7 +486,7 @@ int cmd_tag(int argc, const char **argv, const char *prefix)
 
 	git_config(git_tag_config, NULL);
 
-	opt.sign = 0;
+	memset(&opt, 0, sizeof(opt));
 
 	argc = parse_options(argc, argv, prefix, options, git_tag_usage, 0);
 
@@ -552,10 +552,10 @@ int cmd_tag(int argc, const char **argv, const char *prefix)
 	else if (!force)
 		die(_("tag '%s' already exists"), tag);
 
-	opt.message = msg.given || msgfile;
+	opt.message_given = msg.given || msgfile;
 
 	if (!cleanup_arg || !strcmp(cleanup_arg, "default"))
-		opt.cleanup_mode = !opt.message ? CLEANUP_ALL : CLEANUP_SPACE;
+		opt.cleanup_mode = !opt.message_given ? CLEANUP_ALL : CLEANUP_SPACE;
 	else if (!strcmp(cleanup_arg, "verbatim"))
 		opt.cleanup_mode = CLEANUP_NONE;
 	else if (!strcmp(cleanup_arg, "whitespace"))

^ permalink raw reply related

* Re: Roadmap for 1.7.9
From: Junio C Hamano @ 2011-12-05 22:47 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Git Mailing List
In-Reply-To: <CACBZZX6aC-E4DxaZzzhfGnK8ovBGCq_gNG3hPU7QjfAiNb3WrA@mail.gmail.com>

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Mon, Dec 5, 2011 at 21:07, Junio C Hamano <gitster@pobox.com> wrote:
>
>> Now, here are the biggies that we would want to try to have in reasonable
>> shape before the next release.
>
> I'd like to get the i18n series into 1.7.9 as well. I think it's ready
> as-is but some minor issues are sure to arise.

Surely.

My impression was that the part that can have interactions with the
existing codebase is already in, and it is just the matter of updating
what _() ans Q_() actually do, which can be reverted out quickly if it
turns out to be necessary, and that was why I didn't count it as part of
the "biggies" that make us worry.

^ permalink raw reply

* Yo dawg, I heard you like trees...
From: Sebastian Morr @ 2011-12-05 23:57 UTC (permalink / raw)
  To: git

Just out of curiosity: Do you know of any attempts to construct a tree
object that contains itself, that is, references it's own SHA-1?

Sebastian

^ permalink raw reply

* Re: Roadmap for 1.7.9
From: Ævar Arnfjörð Bjarmason @ 2011-12-06  0:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List
In-Reply-To: <7vr50ik55n.fsf@alter.siamese.dyndns.org>

On Mon, Dec 5, 2011 at 23:47, Junio C Hamano <gitster@pobox.com> wrote:
> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> On Mon, Dec 5, 2011 at 21:07, Junio C Hamano <gitster@pobox.com> wrote:
>>
>>> Now, here are the biggies that we would want to try to have in reasonable
>>> shape before the next release.
>>
>> I'd like to get the i18n series into 1.7.9 as well. I think it's ready
>> as-is but some minor issues are sure to arise.
>
> Surely.
>
> My impression was that the part that can have interactions with the
> existing codebase is already in, and it is just the matter of updating
> what _() ans Q_() actually do, which can be reverted out quickly if it
> turns out to be necessary, and that was why I didn't count it as part of
> the "biggies" that make us worry.

That's the gist of it. But it's still a massive 1300 line patch,
which:

 * Calls setlocale(LC_MESSAGES)
 * Uses an evil trick to only call setlocale(LC_CTYPE) while loading
   message catalogs
 * Might have some portability problems on more obscure OS's.
 * Has some tests I've only tested on Linux, FreeBSD and Solaris

So it might have some little surprises, and it would be nice to have
it in master early on so we'll give it plenty of time to cook and
discover any issues before 1.7.9.

^ permalink raw reply

* Re: Yo dawg, I heard you like trees...
From: Andrew Ardill @ 2011-12-06  0:04 UTC (permalink / raw)
  To: Sebastian Morr; +Cc: git
In-Reply-To: <20111205235740.GB27318@thinkpad>

The difficulty in doing this is essentially the same as breaking
SHA-1, so I doubt anyone has seriously tried to do it.

Regards,

Andrew Ardill



On 6 December 2011 10:57, Sebastian Morr <sebastian@morr.cc> wrote:
> Just out of curiosity: Do you know of any attempts to construct a tree
> object that contains itself, that is, references it's own SHA-1?
>
> Sebastian
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

^ permalink raw reply

* Re: [PATCH v2 0/3] grep multithreading and scaling
From: Jeff King @ 2011-12-06  0:40 UTC (permalink / raw)
  To: Thomas Rast; +Cc: René Scharfe, Eric Herman, git, Junio C Hamano
In-Reply-To: <201112051038.16423.trast@student.ethz.ch>

On Mon, Dec 05, 2011 at 10:38:16AM +0100, Thomas Rast wrote:

> I just found out that on Linux, there's mincore() that can tell us
> (racily, but who cares) whether a given file mapping is in memory.  If
> you would like to try it, see the source at the end, but I'm getting
> things such as

Neat, I didn't know about mincore.

> So that looks fairly promising, and the order would then be:
> 
> - if stat-clean, and we have mincore(), and it tells us we can do it
>   cheaply: grab file from tree
> 
> - if it's a loose object: decompress it
> 
> - if stat-clean: grab file from tree
> 
> - access packs as usual

I don't think your third one makes sense. If the working tree file isn't
stat clean, then either:

  1. the pack file is in cache, and it's way faster than faulting in the
     working tree file from disk

  2. the pack file is not in cache, and it's a toss-up whether it is
     faster to fault in the smaller compressed pack-file version and
     uncompress it, or to fault in the larger on-disk version. The
     exact result will depend on the ratio of CPU to disk speed, the
     quality of your filesystem, and the size and contents of your file.

     And possibly on the exact delta chains you have. Though this
     optimization only happens when the file is in the index, which
     usually means it's recent, which means it will tend to be at the
     head of the delta chain.

So it probably just makes sense to grab the working tree file only if
mincore() tells us we have all (or most) of it, and otherwise go to the
packfile.

> Ok, I see, I missed that part.  Perhaps the heuristic should then be
> "if the regex boils down to memmem, disable threading", but let's see
> what loose object decompression in parallel can give us.

Yeah. I'd really rather have parallel object decompression than some
complex Linux-only mincore optimization (even though that optimization
_could_ yield extra savings on top of properly threading, if the blob
retrieval is threaded, I think I'll care less about how much CPU time it
takes).

-Peff

^ permalink raw reply

* Re: Auto update submodules after merge and reset
From: Max Krasnyansky @ 2011-12-06  1:06 UTC (permalink / raw)
  To: Jens Lehmann; +Cc: git
In-Reply-To: <4ED5E9D2.4060503@web.de>

Hi Jens,

On 11/30/2011 12:31 AM, Jens Lehmann wrote:
> I'm working on a patch series to teach Git to optionally update the 
> submodules work trees on checkout, reset merge and so on, but I'm not 
> there yet.
> [SNIP]

Sorry for not replying right away.
Everything you suggested sounds great. We're on the same page (config 
option, etc).
How far along are you? Do you have a tree I could pull from to play with 
things?
I could help with testing, bug fixes and/or implementing parts of it. 
Let me know.

For now I implemented automatic submodules update using 'post-merge' 
hook. But obviously it does
not handle reset and things. I'm thinking of adding 'post-reset' and 
'pre-merge' that would be useful
for this and maybe other things.

Thanx
Max

^ permalink raw reply

* Re: [RFD] Handling of non-UTF8 data in gitweb
From: Jeff King @ 2011-12-06  1:07 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, Jürgen Kreileder, John Hawley
In-Reply-To: <201112041709.32212.jnareb@gmail.com>

On Sun, Dec 04, 2011 at 05:09:30PM +0100, Jakub Narebski wrote:

> The correct solution would be of course to respect `gui.encoding`
> per-repository config variable, and `encoding` gitattribute...
> though the latter is hampered by the fact that there is currently
> no way to read attribute with "git check-attr" from a given tree:
> think of a diff of change of encoding of a file!

We deal with the same problem at GitHub.

There really isn't a good way to specify per-file encodings. Something
like gui.encoding is too coarse. As you mentioned, we don't do per-tree
gitattribute lookups, so the encoding attribute has problems when the
encoding of a file changes. But even if we implemented them, you still
have the problem of getting a raw sha1 (e.g., git diff 9624865 e0a3260).
There's no way to look up attributes for that.

It would be nice if you could put an "encoding" header into the blob
object. You could use the .gitattributes in place at "git add" time to
set it. And then at lookup time, you either have the encoding, or you
assume it's in utf8 (if it isn't binary, of course).

But there's no room in the blob format for headers; the content starts
right after the size header.

You can get around this by searching the history for a tree that
contains the blob, and then checking the gitattributes. It's expensive,
but you could build a cache over time. However, it's not guaranteed to
provide a single answer; you could have multiple trees that mention the
blobs, each with different attributes.

And even if you implement all that, we have the problem that older blobs
won't have gotten an encoding header, even if they would have under the
new rules. So rather than assuming utf8, you have to make a guess
anyway.

At GitHub, we talked about a lot of these options and ended up just
using an encoding-detection library to make a best guess. It seems to
work well in practice, but it's only been deployed for a couple of
months.

-Peff

^ permalink raw reply

* Re: [PATCH] git-svn.perl: close the edit for propedits even with no mods
From: Eric Wong @ 2011-12-06  1:10 UTC (permalink / raw)
  To: Steven Walter; +Cc: git
In-Reply-To: <1322707047-24227-1-git-send-email-stevenrwalter@gmail.com>

Steven Walter <stevenrwalter@gmail.com> wrote:
> It's legitimate to update the mergeinfo property without
> actually changing any files.  This can happen when changes are
> backported to a branch, and then that branch is merged back
> into mainline.  We still want to record the updated mergeinfo
> for book-keeping.
> 
> Signed-off-by: Steven Walter <stevenrwalter@gmail.com>

Acked-by: Eric Wong <normalperson@yhbt.net>

Pushed to "master" of git://bogomips.org/git-svn.git

(Btw, you got my email address wrong in the Cc:)

^ permalink raw reply

* Re: hooks that do not consume stdin sometimes crash git with SIGPIPE
From: Junio C Hamano @ 2011-12-06  1:39 UTC (permalink / raw)
  To: Joey Hess; +Cc: git, Lars Wirzenius
In-Reply-To: <20111205192930.GA32463@gnu.kitenet.net>

Joey Hess <joey@kitenet.net> writes:

> We had a weird problem where, after moving to a new, faster server,
> "git push" would sometimes fail like this:
>
> Unpacking objects: 100% (3/3), done.
> fatal: The remote end hung up unexpectedly
> fatal: The remote end hung up unexpectedly
>
> Turns out that git-receive-pack was dying due to an uncaught SIGPIPE.

Why do you have a hook that is expected to read from receive-pack that
does _not_ read anything from it in the first place? If you do not care
about the update status given to pre-receive, shouldn't you be using the
update hook and ignoring the command line parameters instead?

I am not saying this is a user configuration error and there is nothing to
fix---Git shouldn't get killed merely because of configuration error.

I am wondering if we would want to have a uniform way to tell run_*_hook()
functions that the hook writer explicitly declines to get any input. E.g.
"hooks/pre-receive-noinput" is called instead of "hooks/pre-receive" and
we do not send any input to it, or something like that.

^ permalink raw reply

* Re: Suggestion on hashing
From: Chris West (Faux) @ 2011-12-06  1:56 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy; +Cc: Bill Zaumen, Jeff King, Git Mailing List
In-Reply-To: <CACsJy8CO1GtpZVo-oA2eKbQadsXYBEKVLfUH0GONR5jovuvH+Q@mail.gmail.com>


Nguyen Thai Ngoc Duy wrote:
> SHA-1 charateristics (like 20 byte length) are hard coded everywhere
> in git, it'd be a big audit.

I was planning to look at this anyway.  My branch[1] allows
  init/add/commit with SHA-256, SHA-512 and all the SHA-3 candidates.

log/fsck/etc. are all broken.  Don't even dare try packs.  Fixing things
  is painful but not impossible.  I'm not convinced the task is even
  remotely insurmountable.

(This is not a request-for-comments, just an informational notification.
  It does not even attempt to address compatability or the like.)

$ make HASH=sha512 -j6
$ PATH=bin-wrappers:..
$ git init && echo hi > foo && git add foo && git commit -m "bang"
Initialized empty Git repository in /.../.git/
[master (root-commit) 
8d3ae658dff0c6e398bb4a0d193974e49acfadedfcd61daca42c931ac18d5ac46f0a068e08d81c25d7b79b1c3f4951e4340eeb90f0ef39de355c9bab7e75faba] 
bang
  1 files changed, 1 insertions(+), 0 deletions(-)
    create mode 100644 foo


1. (Please use the hash-v0.0.1 tag, I rebase.)
   gitweb: http://preview.tinyurl.com/bsufh92
   git://git.goeswhere.com/git/git.git
   https://github.com/FauxFaux/git/tree/hash-v0.0.1

---
Chris West (Faux)
Freenode #git: FauxFaux
https://ssl.goeswhere.com/key-transition-2011-10-10.txt.asc
gpg: 408A E4F1 4EA7 33EF 1265  82C1 B195 E1C4 779B A9B2

^ permalink raw reply

* Re: hooks that do not consume stdin sometimes crash git with SIGPIPE
From: Joey Hess @ 2011-12-06  3:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Lars Wirzenius
In-Reply-To: <7vmxb6iim0.fsf@alter.siamese.dyndns.org>

[-- Attachment #1: Type: text/plain, Size: 737 bytes --]

Junio C Hamano wrote:
> Why do you have a hook that is expected to read from receive-pack that
> does _not_ read anything from it in the first place? If you do not care
> about the update status given to pre-receive, shouldn't you be using the
> update hook and ignoring the command line parameters instead?

My hook *does* consume the stdin in one case, but in another case it
does no checks and so can immediately exit. 

Also, I didn't want it to be run once per updated ref as the update hook
is, since the tests it performs are rather expensive -- loading a perl
wiki engine in order to check that the changeset contains only changes to
wiki pages that are allowed based on the wiki's configuration.

-- 
see shy jo

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply

* Re: Suggestion on hashing
From: Bill Zaumen @ 2011-12-06  3:47 UTC (permalink / raw)
  To: Chris West (Faux); +Cc: Nguyen Thai Ngoc Duy, Jeff King, Git Mailing List
In-Reply-To: <alpine.DEB.2.00.1112060146121.15104@hoki.goeswhere.com>

When I went through the code, I noted that SHA-1 hashes are
currently used for the following:

   * object IDs
   * authentication (something to sign using public-key encryption)
   * data integrity (basically a really good checksum).

While there are lot of 20-byte arrays of unsigned char, many of those
are associated with lookups.  You might want to look at the
number of places that git_SHA1_Init is called (there aren't all that
many of those, and that function indicates the points where SHA-1
hashes are being created).

While a few things I tried were complete false starts (kept those
out of the preliminary patches I sent), I managed to store
a CRC (which you can treat as a place-holder for a real message
digest) for each SHA-1 hash in a pack file, but I did it by
creating a separate file (extension ".mds") and that worked.
I looked into modifying pack files, and that was too messy given
that you'd want older version to still work with newer remote
repositories.  The other factor is that the "mds" files are
computed locally, and at the same time that you create an "idx" file.
The formats of the "pack" and "idx" files don't change.

I've just started on replacing the CRC I used with real message
digests, making new digests easy to add. The plan is to initially
make it work with both a CRC and SHA-1 (the CRC so I can test it
easily by comparing new and old versions to show that nothing
changed when it shouldn't have), and because Git already implements
SHA-1.

I should complete my changes.  If we are lucky, maybe the changes I'm
trying would solve some of the problems you mentioned with pack files.
At least I can store the digests in a way that doesn't break the log
and fsck operations (it went through all the test suites, with only
minor modifications for things like counting the number of files in
particular directories).

If you make changes to commit objects, fixing the test scripts is a 
pain - there are a number of places where SHA-1 values are hard-
coded, and those have to be replaced.

Bill

On Tue, 2011-12-06 at 01:56 +0000, Chris West (Faux) wrote:
> Nguyen Thai Ngoc Duy wrote:
> > SHA-1 charateristics (like 20 byte length) are hard coded everywhere
> > in git, it'd be a big audit.
> 
> I was planning to look at this anyway.  My branch[1] allows
>   init/add/commit with SHA-256, SHA-512 and all the SHA-3 candidates.
> 
> log/fsck/etc. are all broken.  Don't even dare try packs.  Fixing things
>   is painful but not impossible.  I'm not convinced the task is even
>   remotely insurmountable.
> 
> (This is not a request-for-comments, just an informational notification.
>   It does not even attempt to address compatability or the like.)

^ permalink raw reply

* [PATCH] enable SO_KEEPALIVE for connected TCP sockets
From: Eric Wong @ 2011-12-06  4:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Sockets may never receive notification of some link errors,
causing "git fetch" or similar processes to hang forever.
Enabling keepalive messages allows hung processes to error out
after a few minutes/hours depending on the keepalive settings of
the system.

This is a problem noticed when running non-interactive
cronjobs to mirror repositories using "git fetch".

Signed-off-by: Eric Wong <normalperson@yhbt.net>
---
 connect.c |   13 +++++++++++++
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/connect.c b/connect.c
index 51990fa..d0f59ef 100644
--- a/connect.c
+++ b/connect.c
@@ -175,6 +175,15 @@ static void get_host_and_port(char **host, const char **port)
 	}
 }
 
+static void enable_keepalive(int sockfd)
+{
+	int ka = 1;
+
+	if (setsockopt(sockfd, SOL_SOCKET, SO_KEEPALIVE, &ka, sizeof(ka)) < 0)
+		fprintf(stderr, "unable to set SO_KEEPALIVE on socket: %s\n",
+			strerror(errno));
+}
+
 #ifndef NO_IPV6
 
 static const char *ai_name(const struct addrinfo *ai)
@@ -239,6 +248,8 @@ static int git_tcp_connect_sock(char *host, int flags)
 	if (sockfd < 0)
 		die("unable to connect to %s:\n%s", host, error_message.buf);
 
+	enable_keepalive(sockfd);
+
 	if (flags & CONNECT_VERBOSE)
 		fprintf(stderr, "done.\n");
 
@@ -312,6 +323,8 @@ static int git_tcp_connect_sock(char *host, int flags)
 	if (sockfd < 0)
 		die("unable to connect to %s:\n%s", host, error_message.buf);
 
+	enable_keepalive(sockfd);
+
 	if (flags & CONNECT_VERBOSE)
 		fprintf(stderr, "done.\n");
 
-- 
Eric Wong

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox