Git development

Git development
 help / color / mirror / Atom feed

* Re: [RFC] git rm -u
From: Junio C Hamano @ 2013-01-21  9:23 UTC (permalink / raw)
  To: Piotr Krukowiecki
  Cc: Matthieu Moy, Jonathan Nieder, Eric James Michael Ritz,
	Git Mailing List, Tomas Carnecky
In-Reply-To: <CAA01Csrv26WrrJDAo-1cr+rW6rYFGQZpYgtafEh=Wgtzswdv_g@mail.gmail.com>

Piotr Krukowiecki <piotr.krukowiecki@gmail.com> writes:

> Do you mean "git add" will be disallowed without "." or ":/" argument?
> Or will this change in future and "git add" without argument will me
> "whole tree", same as ":/" ?

No.  This is only about "git add -u<RETURN>", not any other forms of
"git add ...with or without other args...".

"git add -u<RETURN>" historically meant, and it still means, to
"update the index with every change in the working tree", even when
you are in a subdirectory.

Back when "git add -u" was invented, we didn't have the ":/", which
lets us tell commands that take pathspecs "I want everything from
the top of the working tree.".  If "git add -u<RETURN>" limited its
operation to the current directory, after working everywhere in the
working tree, cd'ing around and ending up to be in a subdirectory
somwhere deep, you had to "cd ../../.. && git add -u", which was
cumbersome.  If "git add -u" always meant the whole tree, limiting
it to the current directory with "git add -u .<RETURN>" was easy,
and that is why the default was chosen to the "whole tree".

Because we have ":/" these days, changing something that limits its
action to the current directory by default to instead work on the
whole tree no longer makes much sense.  That is, if we _were_ to
change "git add -u<RETURN>", it would be in the opposite direction,
i.e. to update the index only with the paths below the current
directory.

Such a change has to be done carefully.  Existing users do expect
the current behaviour, so we have to first _break_ their fingers and
habits and train them to say "add -u :/" when they mean the whole
tree operation.  Silently accepting "add -u" and changing its
meaning to update the index only with the paths below the current
directory will cause them trouble by leaving changes they _thought_
they added out of the index, and is an unacceptable change.

The first step of migration is "git add -u<RETURN>" that loudly
warns, so that uses of that form in scripts are updated before the
second step to avoid a flag-day breakage and start traing fingers
and habits of the users.

The second step is to make "add -u<RETURN>" fail, again with a
message that tells users to be explicit and add ":/" or "." at the
end if they mean "the whole tree" or "the current directory".

After keeping Git in that secnd step for sufficiently long time to
train users to type ":/" or "." explicitly, we can then finally
switch the default of "git add -u<RETURN>" to limit it to the
current directory, instead of failing the command.

^ permalink raw reply

* Re: git-cvsimport-3 and incremental imports
From: John Keeping @ 2013-01-21  9:36 UTC (permalink / raw)
  To: Eric S. Raymond; +Cc: git
In-Reply-To: <20130120232008.GA25001@thyrsus.com>

On Sun, Jan 20, 2013 at 06:20:08PM -0500, Eric S. Raymond wrote:
> John Keeping <john@keeping.me.uk>:
>> I don't think there is any way to solve this without giving cvsps more
>> information, probably the last commit time for all git branches, but
>> perhaps I'm missing a fast-import feature that can help solve this
>> problem.
> 
> Yes, you are.  The magic incantation is
> 
>     from refs/heads/<branch>^0
> 
> I've just pushed a cvsps-3.9 with an -i option that generates these at
> each branch root.  Combine it with -d and you get incremental
> fast-export.

I don't think this is enough.  I made a very similar change here for
testing (conditional on relative_date_start instead of a new flag) and I
needed this as well in order to prevent empty duplicate commits being
added:

-- >8 --

diff --git a/cvsps.c b/cvsps.c
index fb6a3ad..5771462 100644
--- a/cvsps.c
+++ b/cvsps.c
@@ -1560,7 +1560,7 @@ static bool visible(PatchSet * ps)
  ok:
     //fprintf(stderr, "Time check: %zd %zd %zd\n", restrict_date_start, restrict_date_end, ps->date);
     if (restrict_date_start > 0 &&
-	(ps->date < restrict_date_start ||
+	(ps->date <= restrict_date_start ||
 	 (restrict_date_end > 0 && ps->date > restrict_date_end)))
 	return false;

-- 8< --

But this is nothing more than a sticking plaster that happens to do
enough in this particular case - if the Git repository happened to be on
a different branch, the start date would be wrong and too many or too
few commits could be output.  Git doesn't detect that they commits are
identical to some that we already have because we're explicitly telling
it to make a new commit with the specified parent.

You can easily see the breakage by running the tests in the Git tree,
where the CVS revision map tests fail because they end up with duplicate
versions.

You'll need my cvsimport-3 branch to see these failures as it adds the
"git config" support that the tests rely on:

    git://github.com/johnkeeping/git.git cvsimport-3

John

^ permalink raw reply related

* Re: git interactive rebase 'consume' command
From: Michael Haggerty @ 2013-01-21 11:05 UTC (permalink / raw)
  To: Stephen Kelly; +Cc: git, Junio C Hamano, Jeff King
In-Reply-To: <kdgtir$apt$1@ger.gmane.org>

On 01/20/2013 03:05 PM, Stephen Kelly wrote:
> I find the fixup command during an interactive rebase useful.
> 
> Sometimes when cleaning up a branch, I end up in a situation like this:
> 
>  pick 07bc3c9 Good commit.
>  pick 1313a5e Commit to fixup into c2f62a3.
>  pick c2f62a3 Another commit.
> 
> 
> So, I have to reorder the commits, and change 1313a5e to 'f'. An alternative 
> would be to squash 's' c2f62a3 into 1313a5e and clean up the commit message. 
> The problem with that is it ends up with the wrong author time information.

I do "squash with successor then clean up commit message" all the time.
 I had never worried (or even thought much) about the author time of the
resulting commit.  I think I will continue not worrying about it :-)

I think it would be great to have a shorthand for this operation in "git
rebase --interactive" and I probably would have implemented it when I
added "fixup" if I had been able to think of a good name for it.  Even
though I do this sort of thing less frequently than "fixup", it still
comes up often enough that a special command for it would be useful.

> So, I usually reorder and then fixup, but that can also be problematic if I 
> get a conflict during the re-order (which is quite likely).

It is perverse to have to turn a well-defined and manifestly
conflict-free wish into one that has a good chance of conflicting, just
because of a limitation of the tool.

> I would prefer to be able to mark a commit as 'should be consumed', so that:
> 
>  pick 07bc3c9 Good commit.
>  consume 1313a5e Commit to fixup into c2f62a3.
>  pick c2f62a3 Another commit.
> 
> will result in 
> 
>  pick 07bc3c9 Good commit.
>  pick 62a3c2f Another commit.
> 
> directly.

Excellent.  But the name is not self-explanatory.  And there is
something different about your "consume" command:

Normally, "pick" means that the commit on that line is the start of a
new commit unrelated to its predecessors.  And in general, the command
on one line only affects the lines that come before it, not the lines
that come after it.  Under your proposal "consume" would change the
meaning of the following line, namely by changing what its "pick" means.
 It might be more consistent to require the following line to be changed
to "squash":

    pick 07bc3c9 Good commit.
    consume 1313a5e Commit to fixup into c2f62a3.
    squash c2f62a3 Another commit.

in which case the meaning of "consume" would be something like "pick
this commit but not its commit message.  There would have to be a
prohibition against generating commits with *no* commit messages, to
prevent series like [consume, pick] or [consume, fix, pick] while
allowing series like [consume, consume, squash, fix, fix].

If this is the interpretation, the name "quiet/q" might make things clearer.

Yet another approach would be to allow options on the commands.  For
example,

    pick 07bc3c9 Good commit.
    pick --quiet 1313a5e Commit to fixup into c2f62a3.
    squash c2f62a3 Another commit.

In fact if options were implemented, then "fixup" would mean the same as
"squash --quiet", "reword" could be written "pick --edit", and I'm sure
the new flexibility would make it easier to add other features (e.g.,
"pick --reset-author").

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply

* Re: git-cvsimport-3 and incremental imports
From: Eric S. Raymond @ 2013-01-21 11:28 UTC (permalink / raw)
  To: John Keeping; +Cc: git
In-Reply-To: <20130121093658.GD7498@serenity.lan>

John Keeping <john@keeping.me.uk>:
> But this is nothing more than a sticking plaster that happens to do
> enough in this particular case

I'm beginning to think that's the best outcome we ever get in this
problem domain...

>                    - if the Git repository happened to be on
> a different branch, the start date would be wrong and too many or too
> few commits could be output.  Git doesn't detect that they commits are
> identical to some that we already have because we're explicitly telling
> it to make a new commit with the specified parent.

Then I don't understand the actual failure case.  Either that or you
don't understand the effect of -i. Have you actually experimented with
it?  The reason I suspect you don't understand the feature is that it
shouldn't make any difference to the way -i works which repository branch is
active at the time of the second import.

Here is how I model what is going on:

1. We make commits to multiple branches of a CVS repo up to some given time T.

2. We import it, ending up with a collection of git branches all of which 
   have tip commits dated T or earlier. And *every* commit dated T or earlier
   gets copied over.

3. We make more commits to the same set of branches in CVS.

4. We now run cvsps -d T on the repo. This generates an incremental
   fast-import stream describing all CVS commits *newer* than T (see
   the cvsps manual page).

5. That stream should consist of a set of disconnected branches, each
   (because of -i) beginning with a root commit containing "from
   refs/heads/foo^0" which says to parent the commit on the tip of
   branch foo, whatever that happens to be.  (I don't have to guess
   about this, I tested the feature before shipping.)

6. Now, when git fast-import interprets that stream in the context of
   the repository produced in step 2, for each branch in the
   incremental dump the branch root commit is parented on the tip
   commit of the same branch in the repo.

At step 6, it shouldn't matter at all which branch is active, because
where an incremental branch root gets attached has nothing to do with
which branch is active. 

It is sufficient to avoid duplicate commits that cvsps -d 0 -d T and
cvsps -d T run on the same CVS repo operate on *disjoint sets* of CVS
file commits.  I can see this technique possibly getting confused if T
falls in the middle of a changeset where the CVS timestamps for the
file commits are out of order.  But that's the same case that will
fail if we're importing at file-commit granularity, so there's no new
bug here.

Can you explain at what step my logic is incorrect?
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

^ permalink raw reply

* [PATCH v2] unpack-trees: do not abort when overwriting an existing file with the same content
From: Nguyễn Thái Ngọc Duy @ 2013-01-21 11:40 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy
In-Reply-To: <1358594648-26851-1-git-send-email-pclouds@gmail.com>


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Changes: do not lose worktree's executable permission.

 t/t1011-read-tree-sparse-checkout.sh |  3 ++-
 t/t2021-checkout-overwrite.sh        | 18 ++++++++++++++++++
 unpack-trees.c                       | 31 +++++++++++++++++++++++++++++++
 3 files changed, 51 insertions(+), 1 deletion(-)

diff --git a/t/t1011-read-tree-sparse-checkout.sh b/t/t1011-read-tree-sparse-checkout.sh
index 5c0053a..38f9899 100755
--- a/t/t1011-read-tree-sparse-checkout.sh
+++ b/t/t1011-read-tree-sparse-checkout.sh
@@ -238,7 +238,8 @@ test_expect_success 'print errors when failed to update worktree' '
 	echo sub >.git/info/sparse-checkout &&
 	git checkout -f init &&
 	mkdir sub &&
-	touch sub/added sub/addedtoo &&
+	echo modified >sub/added &&
+	echo modified >sub/addedtoo &&
 	test_must_fail git checkout top 2>actual &&
 	cat >expected <<\EOF &&
 error: The following untracked working tree files would be overwritten by checkout:
diff --git a/t/t2021-checkout-overwrite.sh b/t/t2021-checkout-overwrite.sh
index 5da63e9..bb1696d 100755
--- a/t/t2021-checkout-overwrite.sh
+++ b/t/t2021-checkout-overwrite.sh
@@ -47,4 +47,22 @@ test_expect_success SYMLINKS 'checkout commit with dir must not remove untracked
 	test -h a/b
 '
 
+test_expect_success 'do not abort on overwriting an existing file with the same content' '
+	echo abc >bar &&
+	git add bar &&
+	git commit -m "new file" &&
+	git reset HEAD^ &&
+	git checkout HEAD@{1}
+'
+
+test_expect_success POSIXPERM 'do abort on an existing file, same content but different permission' '
+	git checkout -f HEAD^ &&
+	echo abc >bar &&
+	git add bar &&
+	git commit -m "new file" &&
+	git reset HEAD^ &&
+	chmod a+x bar &&
+	test_must_fail git checkout HEAD@{1}
+'
+
 test_done
diff --git a/unpack-trees.c b/unpack-trees.c
index 0e1a196..ea204ae 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -9,6 +9,8 @@
 #include "refs.h"
 #include "attr.h"
 
+#define SAME_CONTENT_SIZE_LIMIT (1024 * 1024)
+
 /*
  * Error messages expected by scripts out of plumbing commands such as
  * read-tree.  Non-scripted Porcelain is not required to use these messages
@@ -1363,6 +1365,7 @@ static int check_ok_to_remove(const char *name, int len, int dtype,
 			      struct unpack_trees_options *o)
 {
 	struct cache_entry *result;
+	unsigned long ce_size;
 
 	/*
 	 * It may be that the 'lstat()' succeeded even though
@@ -1405,6 +1408,34 @@ static int check_ok_to_remove(const char *name, int len, int dtype,
 			return 0;
 	}
 
+	/*
+	 * If it has the same content that we are going to overwrite,
+	 * there's no point in complaining. We still overwrite it in
+	 * the end though.
+	 */
+	if (ce &&
+	    S_ISREG(st->st_mode) && S_ISREG(ce->ce_mode) &&
+	    (!trust_executable_bit ||
+	     (0100 & (ce->ce_mode ^ st->st_mode)) == 0) &&
+	    st->st_size < SAME_CONTENT_SIZE_LIMIT &&
+	    sha1_object_info(ce->sha1, &ce_size) == OBJ_BLOB &&
+	    ce_size == st->st_size) {
+		void *buffer = NULL;
+		unsigned long size;
+		enum object_type type;
+		struct strbuf sb = STRBUF_INIT;
+		int matched =
+			strbuf_read_file(&sb, ce->name, ce_size) == ce_size &&
+			(buffer = read_sha1_file(ce->sha1, &type, &size)) != NULL &&
+			type == OBJ_BLOB &&
+			size == ce_size &&
+			!memcmp(buffer, sb.buf, size);
+		free(buffer);
+		strbuf_release(&sb);
+		if (matched)
+			return 0;
+	}
+
 	return o->gently ? -1 :
 		add_rejected_path(o, error_type, name);
 }
-- 
1.8.0.rc2.23.g1fb49df

^ permalink raw reply related

* Re: git-cvsimport-3 and incremental imports
From: John Keeping @ 2013-01-21 12:00 UTC (permalink / raw)
  To: Eric S. Raymond; +Cc: git
In-Reply-To: <20130121112853.GA31693@thyrsus.com>

On Mon, Jan 21, 2013 at 06:28:53AM -0500, Eric S. Raymond wrote:
> John Keeping <john@keeping.me.uk>:
>> But this is nothing more than a sticking plaster that happens to do
>> enough in this particular case
> 
> I'm beginning to think that's the best outcome we ever get in this
> problem domain...

I don't think we can ever get a perfect outcome, but it should be
possible to do a little bit better without too much effort.

>>                    - if the Git repository happened to be on
>> a different branch, the start date would be wrong and too many or too
>> few commits could be output.  Git doesn't detect that they commits are
>> identical to some that we already have because we're explicitly telling
>> it to make a new commit with the specified parent.
> 
> Then I don't understand the actual failure case.  Either that or you
> don't understand the effect of -i. Have you actually experimented with
> it?  The reason I suspect you don't understand the feature is that it
> shouldn't make any difference to the way -i works which repository branch is
> active at the time of the second import.
> 
> Here is how I model what is going on:
> 
> 1. We make commits to multiple branches of a CVS repo up to some given time T.
> 
> 2. We import it, ending up with a collection of git branches all of which 
>    have tip commits dated T or earlier. And *every* commit dated T or earlier
>    gets copied over.
>
> 3. We make more commits to the same set of branches in CVS.
> 
> 4. We now run cvsps -d T on the repo. This generates an incremental
>    fast-import stream describing all CVS commits *newer* than T (see
>    the cvsps manual page).

This is the problem step.  There are two scenarios that have problems:

1. If I create a new development branch in my Git repository and commit
   something to it then git-cvsimport-3 will pass a time to cvsps that
   is newer than the actual time of the last import, so T is wrong.

   It may be possible to fix this case purely in git-cvsimport-3.

2. If the branch I have checked out is not the newest CVS branch, then
   git-cvsimport-3 will pass a value of T that is before the time of the
   last import.  This case is more subtle but it results in unwanted
   duplicate commits since git-fast-import will just do what it's told
   and create the new commits.

   So if we have the following commits:

     commit1 at time 1
     commit2 at time 2
     commit3 at time 3

   and I call "cvsps -d 2 -i" I end up with the series:

     commit1 at time 1
     commit2 at time 2
     commit3 at time 3
     commit2 at time 2 - effectively reverting the previous commit
     commit3 at time 3 - a duplicate
     ... and potentially genuinely new commits

   This is demonstrated by running the Git test t9650.

I also disagree that cvsps outputs commits *newer* than T since it will
also output commits *at* T, which is what I changed with the patch in my
previous message.  This fixes the duplicate commit2 in the series above,
but not the duplicate commit3.

> 5. That stream should consist of a set of disconnected branches, each
>    (because of -i) beginning with a root commit containing "from
>    refs/heads/foo^0" which says to parent the commit on the tip of
>    branch foo, whatever that happens to be.  (I don't have to guess
>    about this, I tested the feature before shipping.)
> 
> 6. Now, when git fast-import interprets that stream in the context of
>    the repository produced in step 2, for each branch in the
>    incremental dump the branch root commit is parented on the tip
>    commit of the same branch in the repo.
>  
> At step 6, it shouldn't matter at all which branch is active, because
> where an incremental branch root gets attached has nothing to do with
> which branch is active. 
> 
> It is sufficient to avoid duplicate commits that cvsps -d 0 -d T and
> cvsps -d T run on the same CVS repo operate on *disjoint sets* of CVS
> file commits.  I can see this technique possibly getting confused if T
> falls in the middle of a changeset where the CVS timestamps for the
> file commits are out of order.  But that's the same case that will
> fail if we're importing at file-commit granularity, so there's no new
> bug here.
> 
> Can you explain at what step my logic is incorrect?

Your logic is correct - for cvsps - the problem is where T comes from.

Perhaps it is simplest to just save a CVS_LAST_IMPORT_TIME file in
$GIT_DIR and not worry about it any more.


John

^ permalink raw reply

* [RFC/PATCH] add: warn when -u or -A is used without filepattern
From: Matthieu Moy @ 2013-01-21 12:00 UTC (permalink / raw)
  To: git, gitster
  Cc: Jonathan Nieder, Eric James Michael Ritz, Tomas Carnecky,
	Matthieu Moy
In-Reply-To: <7v1udfn0tm.fsf@alter.siamese.dyndns.org>

Most git commands that can be used with our without a filepattern are
tree-wide by default, the filepattern being used to restrict their scope.
A few exceptions are: 'git grep', 'git clean', 'git add -u' and 'git add -A'.

The inconsistancy of 'git add -u' and 'git add -A' are particularly
problematic since other 'git add' subcommands (namely 'git add -p' and
'git add -e') are tree-wide by default.

Flipping the default now is unacceptable, so this patch starts training
users to type explicitely 'git add -u|-A :/' or 'git add -u|-A .', to prepare
for the next steps:

* forbid 'git add -u|-A' without filepattern (like 'git add' without
  option)

* much later, maybe, re-allow 'git add -u|-A' without filepattern, with a
  tree-wide scope.

A nice side effect of this patch is that it makes the :/ special
filepattern easier to discover for users.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
---
Junio C Hamano <gitster@pobox.com> writes:

> The first step of "git add -u" migration plan would be to warn when
> no argument is given and update all the existing index entries, and
> give the same advise to use either "." or ":/".  Keep this for three
> cycles: 3 * (8 to 10 weeks per cycle) = 27 weeks ~ 1/2 year.

The first step should look like this patch. The message remains vague
about the next steps ("change in a future Git version", no mention of
the exact change nor of the exact version in which it will happen),
but I'm fine with refining it (perhaps this could be a 2.0 change,
like the change to push.default?).

 Documentation/git-add.txt |  7 ++++---
 builtin/add.c             | 30 +++++++++++++++++++++++++++++-
 2 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt
index fd9e36b..5333559 100644
--- a/Documentation/git-add.txt
+++ b/Documentation/git-add.txt
@@ -107,9 +107,10 @@ apply to the index. See EDITING PATCHES below.
 	from the index if the corresponding files in the working tree
 	have been removed.
 +
-If no <filepattern> is given, default to "."; in other words,
-update all tracked files in the current directory and its
-subdirectories.
+If no <filepattern> is given, the current version of Git defaults to
+"."; in other words, update all tracked files in the current directory
+and its subdirectories. This default will change in a future version
+of Git, hence the form without <filepattern> should not be used.
 
 -A::
 --all::
diff --git a/builtin/add.c b/builtin/add.c
index e664100..e6eb829 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -373,6 +373,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 	int add_new_files;
 	int require_pathspec;
 	char *seen = NULL;
+	const char *option_with_implicit_dot = NULL;
 
 	git_config(add_config, NULL);
 
@@ -392,7 +393,34 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 		die(_("-A and -u are mutually incompatible"));
 	if (!show_only && ignore_missing)
 		die(_("Option --ignore-missing can only be used together with --dry-run"));
-	if ((addremove || take_worktree_changes) && !argc) {
+	if (addremove)
+		option_with_implicit_dot = "--all";
+	if (take_worktree_changes)
+		option_with_implicit_dot = "--update";
+	if (option_with_implicit_dot && !argc) {
+		/*
+		 * To be consistant with "git add -p" and most Git
+		 * commands, we should default to being tree-wide, but
+		 * this is not the original behavior and can't be
+		 * changed until users trained themselves not to type
+		 * "git add -u" or "git add -A". For now, we warn and
+		 * keep the old behavior. Later, this warning can be
+		 * turned into a die(...), and eventually we may
+		 * reallow the command with a new behavior.
+		 */
+		warning(_("The behavior of 'git add %s' with no path argument will change in a future\n"
+			  "Git version and shouldn't be used anymore.  To add content for the whole tree, run:\n"
+			  "\n"
+			  "  git add %s :/\n"
+			  "\n"
+			  "To restrict the command to the current directory, run:\n"
+			  "\n"
+			  "  git add %s .\n"
+			  "\n"
+			  "With the current Git version, the command is restricted to the current directory."),
+			option_with_implicit_dot,
+			option_with_implicit_dot,
+			option_with_implicit_dot);
 		static const char *here[2] = { ".", NULL };
 		argc = 1;
 		argv = here;
-- 
1.8.0.319.g8abfee4

^ permalink raw reply related

* Re: git-cvsimport-3 and incremental imports
From: Eric S. Raymond @ 2013-01-21 12:43 UTC (permalink / raw)
  To: John Keeping; +Cc: git
In-Reply-To: <20130121120010.GE7498@serenity.lan>

John Keeping <john@keeping.me.uk>:
> I also disagree that cvsps outputs commits *newer* than T since it will
> also output commits *at* T, which is what I changed with the patch in my
> previous message.

Ah.  OK, that is yet another bug inherited from 2.x - the code doesn't
match the documented (and correct) behavior.  Please send me a patch
against the cvsps repo, I'll merge it.

> Perhaps it is simplest to just save a CVS_LAST_IMPORT_TIME file in
> $GIT_DIR and not worry about it any more.

Yes, I think you're right. Trying to carry that information in-band would
probably doom us to all sorts of bug-prone complications.

Thanks for the good analysis.  I wish everybody I had to chase bugs with
could explain them with such clarity and concision.

Sigh. Now I have to figure out if cvsps's behavior can be rescued in Chris
Rorvick's recently-discovered failure case. I'm not optimistic.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

^ permalink raw reply

* [PATCH 1/2] Update :/abc ambiguity check
From: Nguyễn Thái Ngọc Duy @ 2013-01-21 13:00 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

:/abc may mean two things:

- as a revision, it means the revision that has "abc" in commit
  message.

- as a pathpec, it means "abc" from root.

Currently we see ":/abc" as a rev (most of the time), but never see it
as a pathspec even if "abc" exists and "git log :/abc" will gladly
take ":/abc" as rev even it's ambiguous. This patch makes it:

- ambiguous when "abc" exists on worktree
- a rev if abc does not exist on worktree
- a path if abc is not found in any commits (although better use
  "--" to avoid ambiguation because searching through commit DAG is
  expensive)

A plus from this patch is, because ":/" never matches anything as a
rev, it is never considered a valid rev and because root directory
always exists, ":/" is always unambiguously seen as a pathspec.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 This makes "git grep foo :/" work. For such an often used command,
 the less I type the better.

 The ".. is expensive" thing is not cool. But it hopefully does not
 happen often. I expect people rely on shell's tab completion more
 than :/typing-path-to-somewhere-manually. Copy/paste happens more
 often. I usually type ":" then copy a path from diff --git line.

 We should probably kill ":/abc as a rev" and replace it with @{/abc}
 or something less ambiguous.

 setup.c                       |  9 ++++++++-
 t/t4208-log-magic-pathspec.sh | 17 +++++++++++++++--
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/setup.c b/setup.c
index f108c4b..47acc11 100644
--- a/setup.c
+++ b/setup.c
@@ -66,7 +66,14 @@ int check_filename(const char *prefix, const char *arg)
 	const char *name;
 	struct stat st;
 
-	name = prefix ? prefix_filename(prefix, strlen(prefix), arg) : arg;
+	if (!prefixcmp(arg, ":/")) {
+		if (arg[2] == '\0') /* ":/" is root dir, always exists */
+			return 1;
+		name = arg + 2;
+	} else if (prefix)
+		name = prefix_filename(prefix, strlen(prefix), arg);
+	else
+		name = arg;
 	if (!lstat(name, &st))
 		return 1; /* file exists */
 	if (errno == ENOENT || errno == ENOTDIR)
diff --git a/t/t4208-log-magic-pathspec.sh b/t/t4208-log-magic-pathspec.sh
index 2c482b6..72300b5 100755
--- a/t/t4208-log-magic-pathspec.sh
+++ b/t/t4208-log-magic-pathspec.sh
@@ -11,11 +11,24 @@ test_expect_success 'setup' '
 	mkdir sub
 '
 
-test_expect_success '"git log :/" should be ambiguous' '
-	test_must_fail git log :/ 2>error &&
+test_expect_success '"git log :/" should not be ambiguous' '
+	git log :/
+'
+
+test_expect_success '"git log :/a" should be ambiguous (applied both rev and worktree)' '
+	: >a &&
+	test_must_fail git log :/a 2>error &&
 	grep ambiguous error
 '
 
+test_expect_success '"git log :/a -- " should not be ambiguous' '
+	git log :/a --
+'
+
+test_expect_success '"git log -- :/a" should not be ambiguous' '
+	git log -- :/a
+'
+
 test_expect_success '"git log :" should be ambiguous' '
 	test_must_fail git log : 2>error &&
 	grep ambiguous error
-- 
1.8.0.rc2.23.g1fb49df

^ permalink raw reply related

* [PATCH 2/2] grep: avoid accepting ambiguous revision
From: Nguyễn Thái Ngọc Duy @ 2013-01-21 13:00 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy
In-Reply-To: <1358773249-24384-1-git-send-email-pclouds@gmail.com>


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/grep.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/builtin/grep.c b/builtin/grep.c
index 0e1b6c8..8025964 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -823,6 +823,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			struct object *object = parse_object(sha1);
 			if (!object)
 				die(_("bad object %s"), arg);
+			if (!seen_dashdash)
+				verify_non_filename(prefix, arg);
 			add_object_array(object, arg, &list);
 			continue;
 		}
-- 
1.8.0.rc2.23.g1fb49df

^ permalink raw reply related

* Re: git-cvsimport-3 and incremental imports
From: John Keeping @ 2013-01-21 13:27 UTC (permalink / raw)
  To: Eric S. Raymond; +Cc: git
In-Reply-To: <20130121124340.GA32219@thyrsus.com>

On Mon, Jan 21, 2013 at 07:43:40AM -0500, Eric S. Raymond wrote:
> John Keeping <john@keeping.me.uk>:
>> I also disagree that cvsps outputs commits *newer* than T since it will
>> also output commits *at* T, which is what I changed with the patch in my
>> previous message.
> 
> Ah.  OK, that is yet another bug inherited from 2.x - the code doesn't
> match the documented (and correct) behavior.  Please send me a patch
> against the cvsps repo, I'll merge it.

Should now be in your inbox.

> > Perhaps it is simplest to just save a CVS_LAST_IMPORT_TIME file in
> > $GIT_DIR and not worry about it any more.
> 
> Yes, I think you're right. Trying to carry that information in-band would
> probably doom us to all sorts of bug-prone complications.

I think the only way to do it without needing to save local state in the
Git repository would be to teach cvsps to read a table of refs and times
from its stdin so that we could do something like:

    git for-each-ref --format='%(refname)%09%(*authordate:raw)' refs/heads/ |
    cvsps -i --branch-times-from-stdin |
    git fast-import

Then cvsps could create a hash table from this and use that to decide
whether a patch set is interesting or not.


John

^ permalink raw reply

* Re: GIT get corrupted on lustre
From: Erik Faye-Lund @ 2013-01-21 13:29 UTC (permalink / raw)
  To: Eric Chamberland
  Cc: Pyeron, Jason J CTR (US), Maxime Boissonneault, Philippe Vaucher,
	git@vger.kernel.org, Sébastien Boisvert
In-Reply-To: <50F98B53.9080109@giref.ulaval.ca>

On Fri, Jan 18, 2013 at 6:50 PM, Eric Chamberland
<Eric.Chamberland@giref.ulaval.ca> wrote:
> Good idea!
>
> I did a strace and here is the output with the error:
>
> http://www.giref.ulaval.ca/~ericc/strace_git_error.txt
>
> Hope it will be insightful!

This trace doesn't seem to contain child-processes, but instead having
their stderr inlined into the log. Try using "strace -f" instead...

^ permalink raw reply

* [PATCH] git-for-each-ref.txt: 'raw' is a supported date format
From: John Keeping @ 2013-01-21 13:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Commit 7dff9b3 (Support 'raw' date format) added a raw date format.
Update the git-for-each-ref documentation to include this.

Signed-off-by: John Keeping <john@keeping.me.uk>
---
 Documentation/git-for-each-ref.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index db55a4e..d3e1df7 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -117,7 +117,7 @@ returns an empty string instead.
 
 As a special case for the date-type fields, you may specify a format for
 the date by adding one of `:default`, `:relative`, `:short`, `:local`,
-`:iso8601` or `:rfc2822` to the end of the fieldname; e.g.
+`:iso8601`, `:rfc2822` or `raw` to the end of the fieldname; e.g.
 `%(taggerdate:relative)`.
 
 
-- 
1.8.1

^ permalink raw reply related

* Re: git-cvsimport-3 and incremental imports
From: Eric S. Raymond @ 2013-01-21 14:07 UTC (permalink / raw)
  To: John Keeping; +Cc: git
In-Reply-To: <20130121132706.GF7498@serenity.lan>

John Keeping <john@keeping.me.uk>:
> > Ah.  OK, that is yet another bug inherited from 2.x - the code doesn't
> > match the documented (and correct) behavior.  Please send me a patch
> > against the cvsps repo, I'll merge it.
> 
> Should now be in your inbox.

Received, merged, tested, and cvsps-3.10 has shipped.
 
> I think the only way to do it without needing to save local state in the
> Git repository would be to teach cvsps to read a table of refs and times
> from its stdin so that we could do something like:
> 
>     git for-each-ref --format='%(refname)%09%(*authordate:raw)' refs/heads/ |
>     cvsps -i --branch-times-from-stdin |
>     git fast-import
> 
> Then cvsps could create a hash table from this and use that to decide
> whether a patch set is interesting or not.

Agreed.  I considered implementing something quite this before thinking of
the ^0 hack.  But an out-of-band timestamp file is much simpler.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

^ permalink raw reply

* Re: [PATCH v3 0/2] Make git-svn work with gitdir links
From: Joachim Schmitz @ 2013-01-21 14:19 UTC (permalink / raw)
  To: git
In-Reply-To: <7vwqv7i9su.fsf@alter.siamese.dyndns.org>

Junio C Hamano wrote:
> Barry Wardell <barry.wardell@gmail.com> writes:
>
>> These patches fix a bug which prevented git-svn from working with
>> repositories which use gitdir links.
>>
>> Changes since v2:
>>  - Rebased onto latest master.
>>  - Added test case which verifies that the problem has been fixed.
>>  - Fixed problems with git svn (init|clone|multi-init).
>>  - All git-svn test cases now pass (except two in t9101 which also
>>    failed before these patches).
>>
>> Barry Wardell (2):
>>   git-svn: Add test for git-svn repositories with a gitdir link
>>   git-svn: Simplify calculation of GIT_DIR
>
> Thanks for your persistence ;-) As this is a pretty old topic, I'll
> give two URLs for people who are interested to view the previous
> threads:
>
>    http://thread.gmane.org/gmane.comp.version-control.git/192133
>    http://thread.gmane.org/gmane.comp.version-control.git/192127
>
> You would want to mark it as test_expect_failure in the first patch
> and then flip it to text_expect_success in the second patch where
> you fix the breakage?  Otherwise, after applying the first patch,
> the testsuite will break needlessly.

I'd just apply them the other way round, 1st fix the problem, 2nd add a test 
for it

Bye, Jojo 

^ permalink raw reply

* Re: [RFC/PATCH] add: warn when -u or -A is used without filepattern
From: Robin Rosenberg @ 2013-01-21 15:00 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Jonathan Nieder, Eric James Michael Ritz, Tomas Carnecky, git,
	gitster
In-Reply-To: <1358769611-3625-1-git-send-email-Matthieu.Moy@imag.fr>



----- Ursprungligt meddelande -----
> Most git commands that can be used with our without a filepattern are
> tree-wide by default, the filepattern being used to restrict their
> scope.
> A few exceptions are: 'git grep', 'git clean', 'git add -u' and 'git
> add -A'.
> 
> The inconsistancy of 'git add -u' and 'git add -A' are particularly
> problematic since other 'git add' subcommands (namely 'git add -p'
> and
> 'git add -e') are tree-wide by default.
> 
> Flipping the default now is unacceptable, so this patch starts
> training
> users to type explicitely 'git add -u|-A :/' or 'git add -u|-A .', to
> prepare
> for the next steps:
> 
> * forbid 'git add -u|-A' without filepattern (like 'git add' without
>   option)

git add -u without filepattern is, I believe very common, so no noisy
output there please.

git diff
#looks good
git add -u

-- robin

^ permalink raw reply

* Re: [RFC/PATCH] add: warn when -u or -A is used without filepattern
From: Matthieu Moy @ 2013-01-21 15:16 UTC (permalink / raw)
  To: Robin Rosenberg
  Cc: Jonathan Nieder, Eric James Michael Ritz, Tomas Carnecky, git,
	gitster
In-Reply-To: <1217961884.4232967.1358780423428.JavaMail.root@dewire.com>

Robin Rosenberg <robin.rosenberg@dewire.com> writes:

> git add -u without filepattern is, I believe very common, so no noisy
> output there please.

What are you exactly suggesting? That we keep the inconsistant semantics
of "git add -u" or "git add -A"? Or another migration plan?

> git diff
> #looks good
> git add -u

That's indeed the kind of mistake I'd like to avoid. In your example,
"git diff" is tree-wide, and "git add -u" is limited to ., so in general
"git add -u" won't stage the same thing as "git diff" just showed.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply

* Re: [PATCH] git-for-each-ref.txt: 'raw' is a supported date format
From: Michael Haggerty @ 2013-01-21 16:02 UTC (permalink / raw)
  To: John Keeping; +Cc: Junio C Hamano, git
In-Reply-To: <d3a288a67867d7a60c9217a78bda42301392c3da.1358776352.git.john@keeping.me.uk>

On 01/21/2013 02:53 PM, John Keeping wrote:
> Commit 7dff9b3 (Support 'raw' date format) added a raw date format.
> Update the git-for-each-ref documentation to include this.
> 
> Signed-off-by: John Keeping <john@keeping.me.uk>
> ---
>  Documentation/git-for-each-ref.txt | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
> index db55a4e..d3e1df7 100644
> --- a/Documentation/git-for-each-ref.txt
> +++ b/Documentation/git-for-each-ref.txt
> @@ -117,7 +117,7 @@ returns an empty string instead.
>  
>  As a special case for the date-type fields, you may specify a format for
>  the date by adding one of `:default`, `:relative`, `:short`, `:local`,
> -`:iso8601` or `:rfc2822` to the end of the fieldname; e.g.
> +`:iso8601`, `:rfc2822` or `raw` to the end of the fieldname; e.g.
>  `%(taggerdate:relative)`.

Shouldn't "raw" be preceded with a colon like the other format specifiers?

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply

* Re: GIT get corrupted on lustre
From: Thomas Rast @ 2013-01-21 16:11 UTC (permalink / raw)
  To: git@vger.kernel.org
  Cc: kusmabite, Eric Chamberland, Pyeron, Jason J CTR (US),
	Maxime Boissonneault, Philippe Vaucher, Sébastien Boisvert
In-Reply-To: <CABPQNSbJr4dR9mq+kCwGe-RKb9PA7q=SKzbFW+=md_PLzZh=nQ@mail.gmail.com>

Erik Faye-Lund <kusmabite@gmail.com> writes:

> On Fri, Jan 18, 2013 at 6:50 PM, Eric Chamberland
> <Eric.Chamberland@giref.ulaval.ca> wrote:
>> Good idea!
>>
>> I did a strace and here is the output with the error:
>>
>> http://www.giref.ulaval.ca/~ericc/strace_git_error.txt
>>
>> Hope it will be insightful!
>
> This trace doesn't seem to contain child-processes, but instead having
> their stderr inlined into the log. Try using "strace -f" instead...

I happen to have access to a lustre FS on the brutus cluster of ETH
Zurich, so I figured I could give it a shot.

What's odd is that while I cannot reproduce the original problem, there
seems to be another issue/bug with utime():

  $ strace -f -o ~/gc.trace git gc
  warning: failed utime() on .git/objects/68/tmp_obj_sCAEVc: Interrupted system call
  warning: failed utime() on .git/objects/a6/tmp_obj_3cdB2c: Interrupted system call
  warning: failed utime() on .git/objects/69/tmp_obj_lbU3Xc: Interrupted system call
  warning: failed utime() on .git/objects/c3/tmp_obj_EU97Wc: Interrupted system call
  warning: failed utime() on .git/objects/3e/tmp_obj_tb2j3c: Interrupted system call
  warning: failed utime() on .git/objects/15/tmp_obj_e6zMXc: Interrupted system call
  warning: failed utime() on .git/objects/54/tmp_obj_ExOJVc: Interrupted system call
  warning: failed utime() on .git/objects/e3/tmp_obj_GtPw4c: Interrupted system call
  warning: failed utime() on .git/objects/21/tmp_obj_Xex32c: Interrupted system call
  warning: failed utime() on .git/objects/1a/tmp_obj_CzwsZc: Interrupted system call
  warning: failed utime() on .git/objects/18/tmp_obj_o6fp3c: Interrupted system call
  warning: failed utime() on .git/objects/32/tmp_obj_Ih0G4c: Interrupted system call
  warning: failed utime() on .git/objects/41/tmp_obj_1RXV1c: Interrupted system call
  Counting objects: 137744, done.
  Delta compression using up to 48 threads.
  Compressing objects: 100% (36510/36510), done.
  Writing objects: 100% (137744/137744), done.
  Total 137744 (delta 101591), reused 135512 (delta 99472)

The trace is here (2.1MB compressed):

  http://thomasrast.ch/download/gc.trace.bz2

For the test I used a clone of another git.git I had around.  I think
the error is from sha1_file.c:2564.  While that doesn't look too
important (see ca11b212 for context), it does raise the question: what
other system calls that we never expect to EINTR can do this on
sufficiently arcane system/FS combinations?

Peff's test ran without any apparent issue for a few minutes.  I also
ran an extended version (at the end) that sets alarms, so as to actually
get interrupted.  That proved more interesting.  I had to fix verify()
and write_in_full() to account for EINTR in read()/write(), as those
seem likely to fail.  I also got link() to fail:

  $ ~/lustre-peff-reproducer 
  unable to create hard link: Interrupted system call
  unable to open index file: No such file or directory

but it took a long time.  Unfortunately, when running it with strace I
managed to panic the host I ran it on:

  $ strace -o ~/peff-reproducer.trace ~/lustre-peff-reproducer 

  Message from syslogd@brutus1 at Jan 21 17:09:43 ...                                                    
   kernel:LustreError: 37417:0:(osc_lock.c:1182:osc_lock_enqueue()) ASSERTION( ols->ols_state == OLS_NEW ) failed: Impossible state: 4

  Message from syslogd@brutus1 at Jan 21 17:09:43 ...
   kernel:LustreError: 37417:0:(osc_lock.c:1182:osc_lock_enqueue()) LBUG

  Message from syslogd@brutus1 at Jan 21 17:09:43 ...
   kernel:Kernel panic - not syncing: LBUG

Yay for now having to explain this to the cluster team.

I tried finding a standard that limits the syscalls to which EINTR
applies, without too much success.  I'm not sure how far I should trust
my manpages, but while some of them explicitly list EINTR as a possible
error (read, write, etc.) link() does not.  (And the linux manpages
agree with the POSOIX ones for once.)

If somebody finds such a standard, we could of course use it to blame
lustre instead :-)

In the absence of it, wouldn't we in theory have to write a simple
loop-on-EINTR wrapper for *all* syscalls?

Of course there's the added problem that when open(O_CREAT|O_EXCL) fails
with EINTR, it's hard to tell whether a file that may now exist is
indeed yours or some other process's.

--- 8< ----
#include <fcntl.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>
#include <signal.h>
#include <errno.h>

struct itimerval itv;

static int randomize(unsigned char *buf, int len)
{
  int i;
  len = rand() % len;
  for (i = 0; i < len; i++)
    buf[i] = rand() & 0xff;
  return len;
}

static int check_eof(int fd)
{
  int ch;
  int r = read(fd, &ch, 1);
  if (r < 0) {
    perror("read error after expected EOF");
    return -1;
  }
  if (r > 0) {
    fprintf(stderr, "extra byte after expected EOF");
    return -1;
  }
  return 0;
}

static int verify(int fd, const unsigned char *buf, int len)
{
  while (len) {
    char to_check[4096];
    int got = read(fd, to_check,
                   len < sizeof(to_check) ? len : sizeof(to_check));

    if (got < 0 && errno == EINTR)
      continue;
    if (got < 0) {
      perror("unable to read");
      return -1;
    }
    if (got == 0) {
      fprintf(stderr, "premature EOF (%d bytes remaining)", len);
      return -1;
    }
    if (memcmp(buf, to_check, got)) {
      fprintf(stderr, "bytes differ");
      return -1;
    }

    buf += got;
    len -= got;
  }

  return check_eof(fd);
}

int write_in_full(int fd, const unsigned char *buf, int len)
{
  while (len) {
    int r = write(fd, buf, len);
    if (r < 0 && errno == EINTR)
      continue;
    if (r < 0)
      return -1;
    buf += r;
    len -= r;
  }
  return 0;
}

int move_into_place(const char *old, const char *new)
{
  if (link(old, new) < 0) {
    perror("unable to create hard link");
    return 1;
  }
  unlink(old);
  return 0;
}

void handle_alarm(int signal)
{
}

int main(void)
{
  struct sigaction sa;

  sa.sa_handler = handle_alarm;
  sa.sa_flags = SA_RESTART;
  sigaction(SIGALRM, &sa, NULL);

  itv.it_interval.tv_sec = 0;
  itv.it_interval.tv_usec = 10000;
  itv.it_value.tv_sec = 0;
  itv.it_value.tv_usec = 100000;
  setitimer(ITIMER_REAL, &itv, NULL);

  while (1) {
    static unsigned char junk[1024*1024];
    int len = randomize(junk, sizeof(junk));
    int fd;

    /* clean up from any previous round */
    unlink("tmpfile");
    unlink("final.idx");

    fd = open("tmpfile", O_WRONLY|O_CREAT, 0666);
    if (fd < 0) {
      perror("unable to open tmpfile");
      return 1;
    }
    if (write_in_full(fd, junk, len) < 0 ||
        fsync(fd) < 0 ||
        close(fd) < 0) {
      perror("unable to write");
      return 1;
    }

    if (move_into_place("tmpfile", "final.idx") < 0)
      return 1;

    fd = open("final.idx", O_RDONLY);
    if (fd < 0) {
      perror("unable to open index file");
      return 1;
    }
    if (verify(fd, junk, len) < 0)
      return 1;
    close(fd);
  }
}

^ permalink raw reply

* Re: GIT get corrupted on lustre
From: Maxime Boissonneault @ 2013-01-21 16:14 UTC (permalink / raw)
  To: Thomas Rast
  Cc: git@vger.kernel.org, kusmabite, Eric Chamberland,
	Pyeron, Jason J CTR (US), Philippe Vaucher,
	Sébastien Boisvert
In-Reply-To: <87a9s2o6ri.fsf@pctrast.inf.ethz.ch>

Hi Thomas,
Can you tell me what is the version of the lustre servers and the lustre 
clients ?

Thanks,

Maxime Boissonneault
HPC specialist @ Calcul Québec.

Le 2013-01-21 11:11, Thomas Rast a écrit :
> Erik Faye-Lund <kusmabite@gmail.com> writes:
>
>> On Fri, Jan 18, 2013 at 6:50 PM, Eric Chamberland
>> <Eric.Chamberland@giref.ulaval.ca> wrote:
>>> Good idea!
>>>
>>> I did a strace and here is the output with the error:
>>>
>>> http://www.giref.ulaval.ca/~ericc/strace_git_error.txt
>>>
>>> Hope it will be insightful!
>> This trace doesn't seem to contain child-processes, but instead having
>> their stderr inlined into the log. Try using "strace -f" instead...
> I happen to have access to a lustre FS on the brutus cluster of ETH
> Zurich, so I figured I could give it a shot.
>
> What's odd is that while I cannot reproduce the original problem, there
> seems to be another issue/bug with utime():
>
>    $ strace -f -o ~/gc.trace git gc
>    warning: failed utime() on .git/objects/68/tmp_obj_sCAEVc: Interrupted system call
>    warning: failed utime() on .git/objects/a6/tmp_obj_3cdB2c: Interrupted system call
>    warning: failed utime() on .git/objects/69/tmp_obj_lbU3Xc: Interrupted system call
>    warning: failed utime() on .git/objects/c3/tmp_obj_EU97Wc: Interrupted system call
>    warning: failed utime() on .git/objects/3e/tmp_obj_tb2j3c: Interrupted system call
>    warning: failed utime() on .git/objects/15/tmp_obj_e6zMXc: Interrupted system call
>    warning: failed utime() on .git/objects/54/tmp_obj_ExOJVc: Interrupted system call
>    warning: failed utime() on .git/objects/e3/tmp_obj_GtPw4c: Interrupted system call
>    warning: failed utime() on .git/objects/21/tmp_obj_Xex32c: Interrupted system call
>    warning: failed utime() on .git/objects/1a/tmp_obj_CzwsZc: Interrupted system call
>    warning: failed utime() on .git/objects/18/tmp_obj_o6fp3c: Interrupted system call
>    warning: failed utime() on .git/objects/32/tmp_obj_Ih0G4c: Interrupted system call
>    warning: failed utime() on .git/objects/41/tmp_obj_1RXV1c: Interrupted system call
>    Counting objects: 137744, done.
>    Delta compression using up to 48 threads.
>    Compressing objects: 100% (36510/36510), done.
>    Writing objects: 100% (137744/137744), done.
>    Total 137744 (delta 101591), reused 135512 (delta 99472)
>
> The trace is here (2.1MB compressed):
>
>    http://thomasrast.ch/download/gc.trace.bz2
>
> For the test I used a clone of another git.git I had around.  I think
> the error is from sha1_file.c:2564.  While that doesn't look too
> important (see ca11b212 for context), it does raise the question: what
> other system calls that we never expect to EINTR can do this on
> sufficiently arcane system/FS combinations?
>
>
> Peff's test ran without any apparent issue for a few minutes.  I also
> ran an extended version (at the end) that sets alarms, so as to actually
> get interrupted.  That proved more interesting.  I had to fix verify()
> and write_in_full() to account for EINTR in read()/write(), as those
> seem likely to fail.  I also got link() to fail:
>
>    $ ~/lustre-peff-reproducer
>    unable to create hard link: Interrupted system call
>    unable to open index file: No such file or directory
>
> but it took a long time.  Unfortunately, when running it with strace I
> managed to panic the host I ran it on:
>
>    $ strace -o ~/peff-reproducer.trace ~/lustre-peff-reproducer
>
>    Message from syslogd@brutus1 at Jan 21 17:09:43 ...
>     kernel:LustreError: 37417:0:(osc_lock.c:1182:osc_lock_enqueue()) ASSERTION( ols->ols_state == OLS_NEW ) failed: Impossible state: 4
>
>    Message from syslogd@brutus1 at Jan 21 17:09:43 ...
>     kernel:LustreError: 37417:0:(osc_lock.c:1182:osc_lock_enqueue()) LBUG
>
>    Message from syslogd@brutus1 at Jan 21 17:09:43 ...
>     kernel:Kernel panic - not syncing: LBUG
>
> Yay for now having to explain this to the cluster team.
>
>
> I tried finding a standard that limits the syscalls to which EINTR
> applies, without too much success.  I'm not sure how far I should trust
> my manpages, but while some of them explicitly list EINTR as a possible
> error (read, write, etc.) link() does not.  (And the linux manpages
> agree with the POSOIX ones for once.)
>
> If somebody finds such a standard, we could of course use it to blame
> lustre instead :-)
>
> In the absence of it, wouldn't we in theory have to write a simple
> loop-on-EINTR wrapper for *all* syscalls?
>
> Of course there's the added problem that when open(O_CREAT|O_EXCL) fails
> with EINTR, it's hard to tell whether a file that may now exist is
> indeed yours or some other process's.
>
> --- 8< ----
> #include <fcntl.h>
> #include <unistd.h>
> #include <stdlib.h>
> #include <stdio.h>
> #include <string.h>
> #include <sys/time.h>
> #include <signal.h>
> #include <errno.h>
>
> struct itimerval itv;
>
> static int randomize(unsigned char *buf, int len)
> {
>    int i;
>    len = rand() % len;
>    for (i = 0; i < len; i++)
>      buf[i] = rand() & 0xff;
>    return len;
> }
>
> static int check_eof(int fd)
> {
>    int ch;
>    int r = read(fd, &ch, 1);
>    if (r < 0) {
>      perror("read error after expected EOF");
>      return -1;
>    }
>    if (r > 0) {
>      fprintf(stderr, "extra byte after expected EOF");
>      return -1;
>    }
>    return 0;
> }
>
> static int verify(int fd, const unsigned char *buf, int len)
> {
>    while (len) {
>      char to_check[4096];
>      int got = read(fd, to_check,
>                     len < sizeof(to_check) ? len : sizeof(to_check));
>
>      if (got < 0 && errno == EINTR)
>        continue;
>      if (got < 0) {
>        perror("unable to read");
>        return -1;
>      }
>      if (got == 0) {
>        fprintf(stderr, "premature EOF (%d bytes remaining)", len);
>        return -1;
>      }
>      if (memcmp(buf, to_check, got)) {
>        fprintf(stderr, "bytes differ");
>        return -1;
>      }
>
>      buf += got;
>      len -= got;
>    }
>
>    return check_eof(fd);
> }
>
> int write_in_full(int fd, const unsigned char *buf, int len)
> {
>    while (len) {
>      int r = write(fd, buf, len);
>      if (r < 0 && errno == EINTR)
>        continue;
>      if (r < 0)
>        return -1;
>      buf += r;
>      len -= r;
>    }
>    return 0;
> }
>
> int move_into_place(const char *old, const char *new)
> {
>    if (link(old, new) < 0) {
>      perror("unable to create hard link");
>      return 1;
>    }
>    unlink(old);
>    return 0;
> }
>
> void handle_alarm(int signal)
> {
> }
>
> int main(void)
> {
>    struct sigaction sa;
>
>    sa.sa_handler = handle_alarm;
>    sa.sa_flags = SA_RESTART;
>    sigaction(SIGALRM, &sa, NULL);
>
>    itv.it_interval.tv_sec = 0;
>    itv.it_interval.tv_usec = 10000;
>    itv.it_value.tv_sec = 0;
>    itv.it_value.tv_usec = 100000;
>    setitimer(ITIMER_REAL, &itv, NULL);
>
>    while (1) {
>      static unsigned char junk[1024*1024];
>      int len = randomize(junk, sizeof(junk));
>      int fd;
>
>      /* clean up from any previous round */
>      unlink("tmpfile");
>      unlink("final.idx");
>
>      fd = open("tmpfile", O_WRONLY|O_CREAT, 0666);
>      if (fd < 0) {
>        perror("unable to open tmpfile");
>        return 1;
>      }
>      if (write_in_full(fd, junk, len) < 0 ||
>          fsync(fd) < 0 ||
>          close(fd) < 0) {
>        perror("unable to write");
>        return 1;
>      }
>
>      if (move_into_place("tmpfile", "final.idx") < 0)
>        return 1;
>
>      fd = open("final.idx", O_RDONLY);
>      if (fd < 0) {
>        perror("unable to open index file");
>        return 1;
>      }
>      if (verify(fd, junk, len) < 0)
>        return 1;
>      close(fd);
>    }
> }


-- 
---------------------------------
Maxime Boissonneault
Analyste de calcul - Calcul Québec, Université Laval
Ph. D. en physique

^ permalink raw reply

* Re: GIT get corrupted on lustre
From: Thomas Rast @ 2013-01-21 16:20 UTC (permalink / raw)
  To: Maxime Boissonneault
  Cc: Thomas Rast, git@vger.kernel.org, kusmabite, Eric Chamberland,
	Pyeron, Jason J CTR (US), Philippe Vaucher,
	Sébastien Boisvert
In-Reply-To: <50FD696B.5000205@calculquebec.ca>

Maxime Boissonneault <maxime.boissonneault@calculquebec.ca> writes:

> Hi Thomas,
> Can you tell me what is the version of the lustre servers and the
> lustre clients ?

$ uname -a
Linux brutus4.ethz.ch 2.6.32-279.14.1.el6.x86_64 #1 SMP Tue Nov 6 23:43:09 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
$ cat /proc/fs/lustre/version
lustre: 2.3.0
kernel: patchless_client
build:  2.3.0-RC6--PRISTINE-2.6.32-279.14.1.el6.x86_64

I have no idea what the servers are running, I only have client access.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply

* [PATCH v2] git-for-each-ref.txt: 'raw' is a supported date format
From: John Keeping @ 2013-01-21 16:22 UTC (permalink / raw)
  To: Junio C Hamano, git; +Cc: Michael Haggerty
In-Reply-To: <50FD66AC.1080201@alum.mit.edu>

Commit 7dff9b3 (Support 'raw' date format) added a raw date format.
Update the git-for-each-ref documentation to include this.

Signed-off-by: John Keeping <john@keeping.me.uk>
---
On Mon, Jan 21, 2013 at 05:02:52PM +0100, Michael Haggerty wrote:
> Shouldn't "raw" be preceded with a colon like the other format specifiers?

Yes it should.  Thanks.

 Documentation/git-for-each-ref.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index db55a4e..d3e1df7 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -117,7 +117,7 @@ returns an empty string instead.
 
 As a special case for the date-type fields, you may specify a format for
 the date by adding one of `:default`, `:relative`, `:short`, `:local`,
-`:iso8601` or `:rfc2822` to the end of the fieldname; e.g.
+`:iso8601`, `:rfc2822` or `:raw` to the end of the fieldname; e.g.
 `%(taggerdate:relative)`.
 
 
-- 
1.8.1.353.gc992d5a.dirty

^ permalink raw reply related

* Re: [PATCH 2/3] t0050: honor CASE_INSENSITIVE_FS in add (with different case)
From: Torsten Bögershausen @ 2013-01-21 16:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Torsten Bögershausen, git
In-Reply-To: <50FAF5A4.5090304@web.de>

Sorry for completely messing up the sensitvie/insensiteve

Please discard my patch from pu, I'll send v2.

Sorry for the confusion
/Torsten

^ permalink raw reply

* [PATCH v2 1/3] t0050: known breakage vanished in merge (case change)
From: Torsten Bögershausen @ 2013-01-21 16:45 UTC (permalink / raw)
  To: git; +Cc: tboegi

This test case has passed since this commit:

  commit 0047dd2fd1fc1980913901c5fa098357482c2842
  Author: Steffen Prohaska <prohaska@zib.de>
  Date:   Thu May 15 07:19:54 2008 +0200

    t0050: Fix merge test on case sensitive file systems

Remove the known breakage by using test_expect_success

Signed-off-by: Torsten Bögershausen <tboegi@web.de>
---
Changes since v1: Improved commit message
 t/t0050-filesystem.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t0050-filesystem.sh b/t/t0050-filesystem.sh
index 78816d9..ccd685d 100755
--- a/t/t0050-filesystem.sh
+++ b/t/t0050-filesystem.sh
@@ -77,7 +77,7 @@ $test_case 'rename (case change)' '
 
 '
 
-$test_case 'merge (case change)' '
+test_expect_success 'merge (case change)' '
 
 	rm -f CamelCase &&
 	rm -f camelcase &&
-- 
1.8.0.197.g5a90748

^ permalink raw reply related

* [PATCH v2 2/3] t0050: honor CASE_INSENSITIVE_FS in add (with different case)
From: Torsten Bögershausen @ 2013-01-21 16:46 UTC (permalink / raw)
  To: git; +Cc: tboegi

The test case "add (with different case)" indicates a
known breakage when run on a case insensitive file system.

The test is invalid for case sensitive file system, it will always fail.

Check the precondition CASE_INSENSITIVE_FS before running it.

Signed-off-by: Torsten Bögershausen <tboegi@web.de>
---
Changes since v1: Corrected and improved commit message
 t/t0050-filesystem.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t0050-filesystem.sh b/t/t0050-filesystem.sh
index ccd685d..a6fa3c5 100755
--- a/t/t0050-filesystem.sh
+++ b/t/t0050-filesystem.sh
@@ -88,7 +88,7 @@ test_expect_success 'merge (case change)' '
 
 
 
-test_expect_failure 'add (with different case)' '
+test_expect_failure CASE_INSENSITIVE_FS 'add (with different case)' '
 
 	git reset --hard initial &&
 	rm camelcase &&
-- 
1.8.0.197.g5a90748

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox