Git development

Git development
 help / color / mirror / Atom feed

* Re: [PATCH 01/14] cache.h: add comments for git_path() and git_path_submodule()
From: Junio C Hamano @ 2011-10-13 18:37 UTC (permalink / raw)
  To: mhagger
  Cc: Junio C Hamano, git, Jeff King, Drew Northup, Jakub Narebski,
	Heiko Voigt, Johan Herland, Julian Phillips
In-Reply-To: <1318492715-5931-2-git-send-email-mhagger@alum.mit.edu>

mhagger@alum.mit.edu writes:

> +
> +/*
> + * Return the path of a file within get_git_dir().  The arguments
> + * should be printf-like arguments that produce the filename relative
> + * to get_git_dir().  Return the resulting path, or "/bad-path/" if
> + * there is an error.
> + */
>  extern char *git_path(const char *fmt, ...) __attribute__((format (printf, 1, 2)));

Ok.

> +/*
> + * Return the path of a file within the submodule located at path.

This is confusing. Does this "file within the submodule" refer to files
like "Makefile" tracked in a submodule at "dir"?  Your description for
git_path() above makes it clear that the function is about files like
"index" and "HEAD" that are part of the control information for the
current project, but the above gives an impression that you are talking
about files in the working tree of the submodule.

> + * The other arguments should be printf-like arguments that produce
> + * the filename relative to "<path>/.git".  If "<path>/.git" is a

And the reader is puzzled by the sudden mention of <path>/.git here.

> + * gitlink file, follow it to find the actual submodule git path.
> + * Return the resulting path, or "/bad-path/" if there is an error.
> + */
>  extern char *git_path_submodule(const char *path, const char *fmt, ...)
>  	__attribute__((format (printf, 2, 3)));

^ permalink raw reply

* Re: [PATCH 02/14] struct ref_list: document name member
From: Junio C Hamano @ 2011-10-13 18:37 UTC (permalink / raw)
  To: mhagger
  Cc: git, Jeff King, Drew Northup, Jakub Narebski, Heiko Voigt,
	Johan Herland, Julian Phillips
In-Reply-To: <1318492715-5931-3-git-send-email-mhagger@alum.mit.edu>

mhagger@alum.mit.edu writes:

> From: Michael Haggerty <mhagger@alum.mit.edu>
>
>
> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
> ---
>  refs.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/refs.c b/refs.c
> index 409314d..e4e4bcd 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -12,6 +12,7 @@ struct ref_entry {
>  	unsigned char flag; /* ISSYMREF? ISPACKED? */
>  	unsigned char sha1[20];
>  	unsigned char peeled[20];
> +	/* The full name of the reference (e.g., "refs/heads/master"): */
>  	char name[FLEX_ARRAY];
>  };

Thanks. Needs retitling the patch, though.

^ permalink raw reply

* Re: [PATCH 04/14] refs: rename some parameters result -> sha1
From: Junio C Hamano @ 2011-10-13 18:42 UTC (permalink / raw)
  To: mhagger
  Cc: git, Jeff King, Drew Northup, Jakub Narebski, Heiko Voigt,
	Johan Herland, Julian Phillips
In-Reply-To: <1318492715-5931-5-git-send-email-mhagger@alum.mit.edu>

mhagger@alum.mit.edu writes:

> From: Michael Haggerty <mhagger@alum.mit.edu>
>
>
> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>

[PATCH 03/14] was about a similar topic and explained itself a lot
better.

Even though it hinted as if it may be incomplete by saying "some" in the
subject, it was clear from the description that it consistently renamed
all the "name"s that are about references, not just "some" randomly chosen
ones. It would have been better if the subject did not say "some" to avoid
such implications.

Please give similar love to these sha1[] object names in this patch.

> ---
>  refs.c |   16 ++++++++--------
>  refs.h |    2 +-
>  2 files changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/refs.c b/refs.c
> index 2ae5d0d..c466fcd 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -398,7 +398,7 @@ static struct ref_array *get_loose_refs(const char *submodule)
>  #define MAXREFLEN (1024)
>  
>  static int resolve_gitlink_packed_ref(char *name, int pathlen,
> -				      const char *refname, unsigned char *result)
> +				      const char *refname, unsigned char *sha1)
>  {
>  	int retval = -1;
>  	struct ref_entry *ref;
> @@ -406,14 +406,14 @@ static int resolve_gitlink_packed_ref(char *name, int pathlen,
>  
>  	ref = search_ref_array(array, refname);
>  	if (ref != NULL) {
> -		memcpy(result, ref->sha1, 20);
> +		memcpy(sha1, ref->sha1, 20);
>  		retval = 0;
>  	}
>  	return retval;
>  }
>  
>  static int resolve_gitlink_ref_recursive(char *name, int pathlen,
> -					 const char *refname, unsigned char *result,
> +					 const char *refname, unsigned char *sha1,
>  					 int recursion)
>  {
>  	int fd, len = strlen(refname);
> @@ -424,7 +424,7 @@ static int resolve_gitlink_ref_recursive(char *name, int pathlen,
>  	memcpy(name + pathlen, refname, len+1);
>  	fd = open(name, O_RDONLY);
>  	if (fd < 0)
> -		return resolve_gitlink_packed_ref(name, pathlen, refname, result);
> +		return resolve_gitlink_packed_ref(name, pathlen, refname, sha1);
>  
>  	len = read(fd, buffer, sizeof(buffer)-1);
>  	close(fd);
> @@ -435,7 +435,7 @@ static int resolve_gitlink_ref_recursive(char *name, int pathlen,
>  	buffer[len] = 0;
>  
>  	/* Was it a detached head or an old-fashioned symlink? */
> -	if (!get_sha1_hex(buffer, result))
> +	if (!get_sha1_hex(buffer, sha1))
>  		return 0;
>  
>  	/* Symref? */
> @@ -445,10 +445,10 @@ static int resolve_gitlink_ref_recursive(char *name, int pathlen,
>  	while (isspace(*p))
>  		p++;
>  
> -	return resolve_gitlink_ref_recursive(name, pathlen, p, result, recursion+1);
> +	return resolve_gitlink_ref_recursive(name, pathlen, p, sha1, recursion+1);
>  }
>  
> -int resolve_gitlink_ref(const char *path, const char *refname, unsigned char *result)
> +int resolve_gitlink_ref(const char *path, const char *refname, unsigned char *sha1)
>  {
>  	int len = strlen(path), retval;
>  	char *gitdir;
> @@ -472,7 +472,7 @@ int resolve_gitlink_ref(const char *path, const char *refname, unsigned char *re
>  	}
>  	gitdir[len] = '/';
>  	gitdir[++len] = '\0';
> -	retval = resolve_gitlink_ref_recursive(gitdir, len, refname, result, 0);
> +	retval = resolve_gitlink_ref_recursive(gitdir, len, refname, sha1, 0);
>  	free(gitdir);
>  	return retval;
>  }
> diff --git a/refs.h b/refs.h
> index 13e2aa3..c6b8749 100644
> --- a/refs.h
> +++ b/refs.h
> @@ -133,7 +133,7 @@ extern char *shorten_unambiguous_ref(const char *refname, int strict);
>  extern int rename_ref(const char *oldref, const char *newref, const char *logmsg);
>  
>  /** resolve ref in nested "gitlink" repository */
> -extern int resolve_gitlink_ref(const char *name, const char *refname, unsigned char *result);
> +extern int resolve_gitlink_ref(const char *name, const char *refname, unsigned char *sha1);
>  
>  /** lock a ref and then write its file */
>  enum action_on_err { MSG_ON_ERR, DIE_ON_ERR, QUIET_ON_ERR };

^ permalink raw reply

* Re: [PATCH 05/14] clear_ref_list(): rename from free_ref_list()
From: Junio C Hamano @ 2011-10-13 18:43 UTC (permalink / raw)
  To: mhagger
  Cc: Junio C Hamano, git, Jeff King, Drew Northup, Jakub Narebski,
	Heiko Voigt, Johan Herland, Julian Phillips
In-Reply-To: <1318492715-5931-6-git-send-email-mhagger@alum.mit.edu>

mhagger@alum.mit.edu writes:

> From: Michael Haggerty <mhagger@alum.mit.edu>
>
> Rename the function since it doesn't actually free the array object
> that is passed to it.

The commit log message correctly refers to the "array-ness" of the object
being cleared. Needs retitling the patch to match.

>
> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
> ---
>  refs.c |    8 ++++----
>  1 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/refs.c b/refs.c
> index c466fcd..a2e48e4 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -149,7 +149,7 @@ static struct ref_entry *current_ref;
>  
>  static struct ref_array extra_refs;
>  
> -static void free_ref_array(struct ref_array *array)
> +static void clear_ref_array(struct ref_array *array)
>  {
>  	int i;
>  	for (i = 0; i < array->nr; i++)
> @@ -162,14 +162,14 @@ static void free_ref_array(struct ref_array *array)
>  static void clear_cached_packed_refs(struct cached_refs *refs)
>  {
>  	if (refs->did_packed)
> -		free_ref_array(&refs->packed);
> +		clear_ref_array(&refs->packed);
>  	refs->did_packed = 0;
>  }
>  
>  static void clear_cached_loose_refs(struct cached_refs *refs)
>  {
>  	if (refs->did_loose)
> -		free_ref_array(&refs->loose);
> +		clear_ref_array(&refs->loose);
>  	refs->did_loose = 0;
>  }
>  
> @@ -256,7 +256,7 @@ void add_extra_ref(const char *refname, const unsigned char *sha1, int flag)
>  
>  void clear_extra_refs(void)
>  {
> -	free_ref_array(&extra_refs);
> +	clear_ref_array(&extra_refs);
>  }
>  
>  static struct ref_array *get_packed_refs(const char *submodule)

^ permalink raw reply

* Re: [PATCH 06/14] resolve_gitlink_ref(): improve docstring
From: Junio C Hamano @ 2011-10-13 18:48 UTC (permalink / raw)
  To: mhagger
  Cc: git, Jeff King, Drew Northup, Jakub Narebski, Heiko Voigt,
	Johan Herland, Julian Phillips
In-Reply-To: <1318492715-5931-7-git-send-email-mhagger@alum.mit.edu>

mhagger@alum.mit.edu writes:

> From: Michael Haggerty <mhagger@alum.mit.edu>
>
>
> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>

> -/** resolve ref in nested "gitlink" repository */
> +/**
> + * Resolve refname in the nested "gitlink" repository that is located
> + * at name.  If the resolution is successful, return 0 and set sha1 to
> + * the name of the object; otherwise, return a non-zero value.
> + */

It is clear that "refname" would refer to things like "refs/heads/master",
but "name" is still not clear enough with the description. 'repository
that is located at name' hints that we may be dealing with more than one
repository and 'name' is a way to identify which one, but perhaps "path"
or "submodule" a much clearer way to indicate what the code is doing.

At the UI level, a submodule has "name" and "path" that are often the same
but can be different (e.g. when the superproject moves a submodule that
used to be bound to path "dir" to a different location, only the latter
should change). I do not think resolve_gitlink_ref() takes the submodule
name, but it takes the path to the submodule in the superproject. In that
sense, "submodule_path" would be the clearest descriptive name for this
parameter.

>  extern int resolve_gitlink_ref(const char *name, const char *refname, unsigned char *sha1);

^ permalink raw reply

* Re: [PATCH 07/14] is_refname_available(): remove the "quiet" argument
From: Junio C Hamano @ 2011-10-13 18:49 UTC (permalink / raw)
  To: Drew Northup
  Cc: mhagger, git, Jeff King, Jakub Narebski, Heiko Voigt,
	Johan Herland, Julian Phillips
In-Reply-To: <1318509685.7231.6.camel@drew-northup.unet.maine.edu>

Drew Northup <drew.northup@maine.edu> writes:

> On Thu, 2011-10-13 at 09:58 +0200, mhagger@alum.mit.edu wrote:
>> From: Michael Haggerty <mhagger@alum.mit.edu>
>> 
>> quiet was always set to 0, so get rid of it.  Add a function docstring
>> for good measure.
>
> I would like to know if perhaps it was an unfinished project somewhere
> to propagate the "quiet" option down to this level before removing the
> function argument. Comments?

Have you tried blaming?

^ permalink raw reply

* Re: [CLOSED] git checkout <branch> allowed with uncommitted changes
From: Alexey Shumkin @ 2011-10-13 18:56 UTC (permalink / raw)
  To: arQon; +Cc: git
In-Reply-To: <loom.20111013T181801-923@post.gmane.org>

> (Though if someone can come up with a script / hook / whatever that
> improves the "visibility" of stash, that would be awesome. 
> 
"git-completion" for Bash/ZSH gives such an opportunity
I use it

take a look into 
<git-sources>/contrib/completion/git-completion.bash
-- >8 --
#    3) Consider changing your PS1 to also show the current branch:
#         Bash: PS1='[\u@\h \W$(__git_ps1 " (%s)")]\$ '
#         ZSH:  PS1='[%n@%m %c$(__git_ps1 " (%s)")]\$ '
#
#       The argument to __git_ps1 will be displayed only if you
#       are currently in a git repository.  The %s token will be
#       the name of the current branch.
#
#       In addition, if you set GIT_PS1_SHOWDIRTYSTATE to a nonempty
#       value, unstaged (*) and staged (+) changes will be shown next
#       to the branch name.  You can configure this per-repository
#       with the bash.showDirtyState variable, which defaults to true
#       once GIT_PS1_SHOWDIRTYSTATE is enabled.
#
#       You can also see if currently something is stashed, by setting
#       GIT_PS1_SHOWSTASHSTATE to a nonempty value. If something is
stashed, #       then a '$' will be shown next to the branch name.
#
#       If you would like to see if there're untracked files, then you
can #       set GIT_PS1_SHOWUNTRACKEDFILES to a nonempty value. If
there're #       untracked files, then a '%' will be shown next to the
branch name. #
#       If you would like to see the difference between HEAD and its
#       upstream, set GIT_PS1_SHOWUPSTREAM="auto".  A "<" indicates
#       you are behind, ">" indicates you are ahead, and "<>"
#       indicates you have diverged.
-- >8 --
my .bashrc contains (shortly)
PS1='\[\e]0;\w [$(__git_ps1 "%s")]\a\]\n\[\e[32m\]\u@\h'
PS1=$PS1' \[\e[33m\]\w\[\e[0m\]\n[$(__git_ps1 "%s")]\n\$'

export PS1
export GIT_PS1_SHOWDIRTYSTATE=1
export GIT_PS1_SHOWSTASHSTATE=1
export GIT_PS1_SHOWUPSTREAM="auto"

and console prompt with all possible cases looks like

<username>@<hostname> ~/Git-src.git/contrib/completion
[post-receive-email *+$>]
$ 

* - I have unstaged changes
+ - I have staged changes
$ - I have stashed changes (ta-daaa!)
> - I have commits ahead upsteam (named branch I branched from)

P.S. 
And JFYI, it is a good form in mailing lists to CC (Reply to all)
participants

^ permalink raw reply

* Re: [BUG] git checkout <branch> allowed with uncommitted changes
From: arQon @ 2011-10-13 18:56 UTC (permalink / raw)
  To: git
In-Reply-To: <7vzkh44ug1.fsf@alter.siamese.dyndns.org>

Junio C Hamano <gitster <at> pobox.com> writes:
> Perhaps you can prepare a documentation patch to make it clear that git
> WILL preserve the LOCAL CHANGES to the working tree?

Or to put that another way, git WILL NOT "rewind" those local changes when
switching branches (which I still think is the more common case for new users
than failing to branch before editing files). Or refuse to switch if you have
some. Except for when it does.

I'll give a shot, though I don't know how good it'll be. Off the top of my
head, I don't see any good way to explain the inconsistency with LOCAL CHANGES
sometimes preventing switches and sometimes not, based on what is to the user
an arbitrary set of rules that has nothing to do with the *current state* of
the worktree, but rather the state of those files in prior commits.

But sure, I'll see if I can come up with something. If nothing else, having the
manpage at least explain what "M" means; that it can be potentially disastrous;
and what you need to do to avoid it, would be a definite plus.

^ permalink raw reply

* Re: [CLOSED] git checkout <branch> allowed with uncommitted changes
From: Jakub Narebski @ 2011-10-13 19:01 UTC (permalink / raw)
  To: arQon; +Cc: git
In-Reply-To: <loom.20111013T181801-923@post.gmane.org>

arQon <arqon@gmx.com> writes:

> (Though if someone can come up with a script / hook / whatever that improves
> the "visibility" of stash, that would be awesome. Or one that makes the
> refusal to switch branches consistent).

Well, if you use __git_ps1 from contrib/completion/git-completion.bash
(installed with git-core package for some time), there is an option to
add '$' to branch name if stash is non-empty (though it doesn't actually
check if stash was on said branch).

> Looking at the manpage for checkout in the hope that there might be a "--safe"
> switch, I don't understand why
>
>   "-f  Proceed even if the index *or the working tree* differs from HEAD."
>
> even exists, since it proceeds under those conditions anyway.
> "--safe" appears to be exactly what the behavior should be if you DON'T
> specify -f, except that -f nukes the working tree outright rather than just
> bleeding it across. Hopefully it'll be clearer after some sleep.  :)

Without '-f' git-checkout would switch branches only if uncomitted
changes (which do not belong to any branch) could be "floated" on top
of new branch.

If branch you are switching to has differences from current branch
that conflict with uncomitted changes, git would refuse switching
branches.  Now '-f' would get rid of your uncomitted changes, and '-m'
try to merge it with changes brought by new branch.

HTH
-- 
Jakub Narębski

^ permalink raw reply

* Re: [Bug] git pull doesn't recognize --work-tree parameter
From: Junio C Hamano @ 2011-10-13 19:06 UTC (permalink / raw)
  To: Jeff King; +Cc: Kirill Likhodedov, git
In-Reply-To: <20111013183709.GB17573@sigill.intra.peff.net>

Jeff King <peff@peff.net> writes:

> And then convert the two scripts in my patch to use it (along with the
> change to require_work_tree_exists).  That would make my prior analysis
> hold, then, as the annoying do-nothing behavior of "cd_to_toplevel" only
> kicks in when we are outside the work tree (i.e., it could not have
> happened before in those scripts, because the existing require_work_tree
> call would cause us to die).
> ...
> Right. I suspect the proposed behavior for cd_to_toplevel is what they
> all would want eventually, but some scripts may need minor tweaks. I
> think we should follow the same path as require_work_tree_exists, and
> introduce the new function, use it where we know it's safe, and then
> eventually get rid of the old one.
>
> The real trick is coming up with a good name, because cd_to_toplevel is
> already taken. :)

It is not as simple as that I am afraid. We could introduce cd_to_top with
the new semantics and use it in pull and rebase, but a case that would
break is for a script (let's call that hypothetical operation "git svn
dcommit", even though I do not know if dcommit uses the real working tree
or a temporary one) that prepares a temporary working tree inside .git/svn/
and run "git rebase" there without setting GIT_WORKING_TREE to point at
the temporary directory.

With cd_to_toplevel, such a "rebase" would work and "git svn dcommit" can
take that result and do whatever it wants to the real working tree after
it finishes. When we start using cd_to_top in the updated "rebase", such a
script suddenly breaks, as we would start touching the real working tree.

So I do not think it makes much sense to add cd_to_top with updated
semantics while keeping cd_to_toplevel.

What we could do is to update cd_to_toplevel so that it would notice and
warn when the results between the historical incorrect behaviour and the
updated behaviour would be different. The warning can first read "You are
running 'rebase' somewhere in $GIT_DIR without setting $GIT_WORK_TREE; we
historically used the directory you started 'rebase' as the top level of
the working tree, and this version continues to do so, but it will change
to work on the real working tree associated with your $GIT_DIR in future
versions of git. Update your script to correctly set $GIT_WORK_TREE", and
then we transition to start using the new semantics while rewording the
warning message, and then later remove the warning altogether.

^ permalink raw reply

* Re: [Bug] git pull doesn't recognize --work-tree parameter
From: Jeff King @ 2011-10-13 19:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Kirill Likhodedov, git
In-Reply-To: <7v62js4sop.fsf@alter.siamese.dyndns.org>

On Thu, Oct 13, 2011 at 12:06:46PM -0700, Junio C Hamano wrote:

> It is not as simple as that I am afraid. We could introduce cd_to_top with
> the new semantics and use it in pull and rebase, but a case that would
> break is for a script (let's call that hypothetical operation "git svn
> dcommit", even though I do not know if dcommit uses the real working tree
> or a temporary one) that prepares a temporary working tree inside .git/svn/
> and run "git rebase" there without setting GIT_WORKING_TREE to point at
> the temporary directory.

I didn't think that could happen now, because you would not be in the
working tree, and therefore require_work_tree would fail. E.g., with
current git I get:

  $ mkdir .git/tmp
  $ cd .git/tmp
  $ git rebase
  fatal: fatal: /home/peff/local/git/private/libexec/git-core/git-rebase
  cannot be used without a working tree.

So that case is already broken. The only change this would make is that
what used to fail would not actually take them to the top-level of the
working tree[1].

-Peff

[1] Actually, I am not sure it would do that. If we are in $GIT_DIR, do
we necessarily know where the working tree is? I guess in a non-bare
repo, we assume it is $GIT_DIR/..?

^ permalink raw reply

* Re: [Bug] git pull doesn't recognize --work-tree parameter
From: Jeff King @ 2011-10-13 19:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Kirill Likhodedov, git
In-Reply-To: <20111013191457.GA18460@sigill.intra.peff.net>

On Thu, Oct 13, 2011 at 03:14:57PM -0400, Jeff King wrote:

> On Thu, Oct 13, 2011 at 12:06:46PM -0700, Junio C Hamano wrote:
> 
> > It is not as simple as that I am afraid. We could introduce cd_to_top with
> > the new semantics and use it in pull and rebase, but a case that would
> > break is for a script (let's call that hypothetical operation "git svn
> > dcommit", even though I do not know if dcommit uses the real working tree
> > or a temporary one) that prepares a temporary working tree inside .git/svn/
> > and run "git rebase" there without setting GIT_WORKING_TREE to point at
> > the temporary directory.
> 
> I didn't think that could happen now, because you would not be in the
> working tree, and therefore require_work_tree would fail. E.g., with
> current git I get:
> 
>   $ mkdir .git/tmp
>   $ cd .git/tmp
>   $ git rebase
>   fatal: fatal: /home/peff/local/git/private/libexec/git-core/git-rebase
>   cannot be used without a working tree.
> 
> So that case is already broken. The only change this would make is that
> what used to fail would not actually take them to the top-level of the
> working tree[1].

Ugh. It does work if you do:

  mkdir .git/tmp
  cd .git/tmp
  GIT_DIR=$PWD/.. git rebase

What a god-awful mess our initialization rules are.

-Peff

^ permalink raw reply

* Re: [BUG] git checkout <branch> allowed with uncommitted changes
From: PJ Weisberg @ 2011-10-13 19:44 UTC (permalink / raw)
  To: git
In-Reply-To: <loom.20111013T171530-970@post.gmane.org>

On Thu, Oct 13, 2011 at 8:53 AM, arQon <arqon@gmx.com> wrote:
>>git co master
> error: Your local changes to the following files would be overwritten by
> checkout:
>        file1.txt
> Please, commit your changes or stash them before you can switch branches.
> Aborting
>
> I'm sure if I thought about it enough (ie re-read Andreas's post a couple
> more times) I'd be able to understand why git gets it right sometimes but
> not other times, but I'm too tired right now. Even when I *am* awake and

Git gets it "right" (by your definition) when file1.txt on one branch
is different from file1.txt on the other branch.  That means that
switching branches would require changing the file, so it refuses to
overwrite your changes by doing so.  If it CAN switch branches without
losing your changes, it does.

The fundamental problem is that you're thinking of the changes to the
working tree (which aren't commited) as being "on" some branch.  Until
they're committed, changes in the working tree are only in the working
tree.  That's basically the difference between "committed" and "not
committed".

-PJ

^ permalink raw reply

* Re: [BUG] git checkout <branch> allowed with uncommitted changes
From: Carlos Martín Nieto @ 2011-10-13 20:07 UTC (permalink / raw)
  To: arQon; +Cc: git
In-Reply-To: <loom.20111013T193054-868@post.gmane.org>

[-- Attachment #1: Type: text/plain, Size: 3810 bytes --]

On Thu, 2011-10-13 at 18:19 +0000, arQon wrote:
> Carlos Martín Nieto <cmn <at> elego.de> writes:
> > If file1.txt in the foo branch is different from the one in the master
> > branch, git will refuse to switch branches. 'git diff foo master' should
> > show that those two files are different.
> 
> Right, but only for a definition of "branch" that is actually "a fully
> committed branch", hence the confusion and the mention of "uncommitted
> changes" in the topic.

I'm not aware of any other definition of a branch, either for git or
subversion.

> 
> An expectation that "co branch" should be analogous to "cd ../branch/" is by
> no means unreasonable. YOU may know better, but it's surprisingly non-obvious,

I don't see how. Switching branches is not the same as changing
directories. It doesn't work that way, neither with git nor subversion.
If you choose to have each branch in their own directory, that's fine,
but it has very little to do with the VCS tool.

> especially considering the -f option on checkout and the wording of -m, both
> of which strongly suggest that, in the absence of either of those flags, git
> WILL preserve the worktree by refusing to switch until that potentially-
> harmful situation is resolved by the user.

The general description could probably benefit from a more explicit
mention of what happens if there are local modifications. Currently it
looks like it's only mentioned in the text of -f and -m, which is not
particularly helpful.

> 
> > Committing non-working code is fine, as long as you don't push it out.
> 
> Right, but for the problem I was describing it's actually "committing
> non-working code is a requirement, in this situation, if you don't want your
> tree to get eaten". Going from "you absolutely must not do this" to "you must
> do this" takes some mental adjustment, but you also have to be *aware* that
> you now have to do something that was previously prohibited, which I wasn't.

You can also have a different directory for the other branch if you
really don't want to commit until it works. This is the same situation
that you find yourself right now with subversion; I don't see how it's
that hard to recognise that.

> 
> > The bigger problem seems to be your reluctance to accept that git is
> > different from subversion
> 
> Not at all. If I didn't WANT something different, I wouldn't have been trying
> to move to git in the first place.  :)
> 
> > but don't go around saying that git
> > corrupts branches when that's blatantly not true.
> 
> See my first para in this post (or indeed, the original post). It's "not true"
> provided all branches are fully committed when you switch between them.

Right.

> It blatantly IS true if you switch from a dirty branch.

No. The branch has not been corrupted or changed at all. Your local
modifications to files in the working tree were kept. Again, this
happens both for git and svn.

> Redefining "branch" to mean "fully committed branch" makes it "not true" in
> that context, but so does redefining green to be red and saying that grass is
> red in that context: it may be correct from a certain POV, but it's
> incomprehensible to anyone who isn't aware of that semantic change.

This smells like FUD. A branch and a directory are two different things.
If you find it more comfortable to use different directories for
different branches that's fine, but that doesn't make it a branch.
Changing a file doesn't automatically mean that that version of the file
belongs to the currently active branch (or URL in the case for svn). A
branch is only ever changed when you commit. This is something that
holds true across VCSs. Play with subversion's 'switch' command, it
behaves the same way.

   cmn

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply

* [PATCH 0/1] transport: Use is_bundle instead of !is_gitfile
From: Phil Hord @ 2011-10-13 20:26 UTC (permalink / raw)
  To: Git List, Junio C Hamano; +Cc: Phil Hord

I'm not sure where this patch should go now since I see my
original patchset on both next and pu.  Should I re-roll the
original series with this one merged in, or just submit this
one isolated, as I am doing here?

Pardon my newbieness.

Phil Hord (1):
  transport: Add is_bundle() to detect bundle urls

 bundle.c    |   16 ++++++++++++++++
 bundle.h    |    1 +
 transport.c |   10 +---------
 3 files changed, 18 insertions(+), 9 deletions(-)

-- 
1.7.7.334.g311c9.dirty

^ permalink raw reply

* [PATCH 1/1] transport: Add is_bundle() to detect bundle urls
From: Phil Hord @ 2011-10-13 20:26 UTC (permalink / raw)
  To: Git List, Junio C Hamano; +Cc: Phil Hord
In-Reply-To: <1318537562-18581-1-git-send-email-hordp@cisco.com>

Transport decides that any local url which is a file
must be a bundle.  This is wrong if the local url is a
gitfile. Teach transport to verify the file is
really a bundle instead of just assuming it is so.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Phil Hord <hordp@cisco.com>
---
 bundle.c    |   16 ++++++++++++++++
 bundle.h    |    1 +
 transport.c |   10 +---------
 3 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/bundle.c b/bundle.c
index f82baae..cc0624d 100644
--- a/bundle.c
+++ b/bundle.c
@@ -23,6 +23,22 @@ static void add_to_ref_list(const unsigned char *sha1, const char *name,
 	list->nr++;
 }
 
+int is_bundle(const char *path)
+{
+	char buffer[100];
+	FILE *ffd = fopen(path, "rb");
+	int ret=1;
+
+	if (!ffd)
+		return 0;
+
+	if (!fgets(buffer, sizeof(buffer), ffd) ||
+			strcmp(buffer, bundle_signature))
+		ret=0;
+	fclose(ffd);
+	return ret;
+}
+
 /* returns an fd */
 int read_bundle_header(const char *path, struct bundle_header *header)
 {
diff --git a/bundle.h b/bundle.h
index c5a22c8..35aa0eb 100644
--- a/bundle.h
+++ b/bundle.h
@@ -14,6 +14,7 @@ struct bundle_header {
 	struct ref_list references;
 };
 
+int is_bundle(const char *path);
 int read_bundle_header(const char *path, struct bundle_header *header);
 int create_bundle(struct bundle_header *header, const char *path,
 		int argc, const char **argv);
diff --git a/transport.c b/transport.c
index 57138d9..c76f5b7 100644
--- a/transport.c
+++ b/transport.c
@@ -881,14 +881,6 @@ static int is_gitfile(const char *url)
 	return !prefixcmp(buf, "gitdir: ");
 }
 
-static int is_file(const char *url)
-{
-	struct stat buf;
-	if (stat(url, &buf))
-		return 0;
-	return S_ISREG(buf.st_mode);
-}
-
 static int external_specification_len(const char *url)
 {
 	return strchr(url, ':') - url;
@@ -929,7 +921,7 @@ struct transport *transport_get(struct remote *remote, const char *url)
 		ret->fetch = fetch_objs_via_rsync;
 		ret->push = rsync_transport_push;
 		ret->smart_options = NULL;
-	} else if (is_local(url) && is_file(url) && !is_gitfile(url)) {
+	} else if (is_local(url) && is_bundle(url)) {
 		struct bundle_transport_data *data = xcalloc(1, sizeof(*data));
 		ret->data = data;
 		ret->get_refs_list = get_refs_from_bundle;
-- 
1.7.7.334.g311c9.dirty

^ permalink raw reply related

* Re: [PATCH 10/14] is_dup_ref(): extract function from sort_ref_list()
From: Junio C Hamano @ 2011-10-13 20:43 UTC (permalink / raw)
  To: mhagger
  Cc: git, Jeff King, Drew Northup, Jakub Narebski, Heiko Voigt,
	Johan Herland, Julian Phillips
In-Reply-To: <1318492715-5931-11-git-send-email-mhagger@alum.mit.edu>

mhagger@alum.mit.edu writes:

> ...
> @@ -94,15 +113,11 @@ static void sort_ref_array(struct ref_array *array)
>  
>  	qsort(array->refs, array->nr, sizeof(*array->refs), ref_entry_cmp);
>  
> -	/* Remove any duplicates from the ref_array */
> +	/* Remove any duplicates from the ref_list */

Eh... Also needs retitled.

>  	for (; j < array->nr; j++) {
>  		struct ref_entry *a = array->refs[i];
>  		struct ref_entry *b = array->refs[j];
> -		if (!strcmp(a->name, b->name)) {
> -			if (hashcmp(a->sha1, b->sha1))
> -				die("Duplicated ref, and SHA1s don't match: %s",
> -				    a->name);
> -			warning("Duplicated ref: %s", a->name);
> +		if (is_dup_ref(a, b)) {
>  			free(b);
>  			continue;
>  		}

^ permalink raw reply

* Re: [Bug] git pull doesn't recognize --work-tree parameter
From: Junio C Hamano @ 2011-10-13 20:48 UTC (permalink / raw)
  To: Jeff King; +Cc: Kirill Likhodedov, git
In-Reply-To: <20111013191457.GA18460@sigill.intra.peff.net>

Jeff King <peff@peff.net> writes:

> I didn't think that could happen now, because you would not be in the
> working tree, and therefore require_work_tree would fail.

Ok, then I was worried about a non-existing problem, which is good.

Then we can introduce cd_to_top that is more sensible than cd_to_toplevel
and perhaps add some warning to the latter about deprecation and
migration.

Thanks.

^ permalink raw reply

* [RFC/PATCH 1/2] bundle: allowing to read from an unseekable fd
From: Junio C Hamano @ 2011-10-13 22:32 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: git

The current code opens a given file with fopen(), reads it until the end
of the header and runs ftell(), and reopens the same file with open() and
seeks to skip the header. This structure makes it hard to retarget the
code to read from input that is not seekable, such as a network socket.

This patch by itself does not reach that goal yet, but I think it is a
right step in that direction.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 * It would be nice if we can avoid byte-by-byte reading from the file
   descriptor by over-reading into the strbuf and pass the remainder to
   the caller of read_bundle_header(), but I suspect that it would require
   us to carry the "here is the remainder from the previous read" buffer
   around throughout the transport layer. Parsing of the header wouldn't
   be performance critical compared to the computation cost of actually
   reading the rest of the bundle, hopefully, so...

 bundle.c |   99 ++++++++++++++++++++++++++++++++++++++++----------------------
 1 files changed, 64 insertions(+), 35 deletions(-)

diff --git a/bundle.c b/bundle.c
index f48fd7d..3aa715c 100644
--- a/bundle.c
+++ b/bundle.c
@@ -23,49 +23,78 @@ static void add_to_ref_list(const unsigned char *sha1, const char *name,
 	list->nr++;
 }
 
-/* returns an fd */
+/* Eventually this should go to strbuf.[ch] */
+static int strbuf_readline_fd(struct strbuf *sb, int fd)
+{
+	strbuf_reset(sb);
+
+	while (1) {
+		char ch;
+		ssize_t len = xread(fd, &ch, 1);
+		if (len < 0)
+			return -1;
+		strbuf_addch(sb, ch);
+		if (ch == '\n')
+			break;
+	}
+	return 0;
+}
+
 int read_bundle_header(const char *path, struct bundle_header *header)
 {
-	char buffer[1024];
-	int fd;
-	long fpos;
-	FILE *ffd = fopen(path, "rb");
+	struct strbuf buf = STRBUF_INIT;
+	int fd = open(path, O_RDONLY);
+	int status = 0;
 
-	if (!ffd)
+	if (fd < 0)
 		return error("could not open '%s'", path);
-	if (!fgets(buffer, sizeof(buffer), ffd) ||
-			strcmp(buffer, bundle_signature)) {
-		fclose(ffd);
-		return error("'%s' does not look like a v2 bundle file", path);
+
+	/* The bundle header begins with the signature */
+	if (strbuf_readline_fd(&buf, fd) ||
+	    strcmp(buf.buf, bundle_signature)) {
+		error("'%s' does not look like a v2 bundle file", path);
+		status = -1;
+		goto abort;
 	}
-	while (fgets(buffer, sizeof(buffer), ffd)
-			&& buffer[0] != '\n') {
-		int is_prereq = buffer[0] == '-';
-		int offset = is_prereq ? 1 : 0;
-		int len = strlen(buffer);
+
+	/* The bundle header ends with an empty line */
+	while (!strbuf_readline_fd(&buf, fd) &&
+	       buf.len && buf.buf[0] != '\n') {
 		unsigned char sha1[20];
-		struct ref_list *list = is_prereq ? &header->prerequisites
-			: &header->references;
-		char delim;
-
-		if (len && buffer[len - 1] == '\n')
-			buffer[len - 1] = '\0';
-		if (get_sha1_hex(buffer + offset, sha1)) {
-			warning("unrecognized header: %s", buffer);
-			continue;
+		int is_prereq = 0;
+
+		if (*buf.buf == '-') {
+			is_prereq = 1;
+			strbuf_remove(&buf, 0, 1);
+		}
+		strbuf_rtrim(&buf);
+
+		/*
+		 * Tip lines have object name, SP, and refname.
+		 * Prerequisites have object name that is optionally
+		 * followed by SP and subject line.
+		 */
+		if (get_sha1_hex(buf.buf, sha1) ||
+		    (40 <= buf.len && !isspace(buf.buf[40])) ||
+		    (!is_prereq && buf.len <= 40)) {
+			error("unrecognized header: %s%s (%d)",
+			      (is_prereq ? "-" : ""), buf.buf, (int)buf.len);
+			status = -1;
+			break;
+		} else {
+			if (is_prereq)
+				add_to_ref_list(sha1, "", &header->prerequisites);
+			else
+				add_to_ref_list(sha1, buf.buf + 41, &header->references);
 		}
-		delim = buffer[40 + offset];
-		if (!isspace(delim) && (delim != '\0' || !is_prereq))
-			die ("invalid header: %s", buffer);
-		add_to_ref_list(sha1, isspace(delim) ?
-				buffer + 41 + offset : "", list);
 	}
-	fpos = ftell(ffd);
-	fclose(ffd);
-	fd = open(path, O_RDONLY);
-	if (fd < 0)
-		return error("could not open '%s'", path);
-	lseek(fd, fpos, SEEK_SET);
+
+ abort:
+	if (status) {
+		close(fd);
+		fd = -1;
+	}
+	strbuf_release(&buf);
 	return fd;
 }
 
-- 
1.7.7.289.gd0d4bb

^ permalink raw reply related

* [PATCH 2/2] bundle: add parse_bundle_header() helper function
From: Junio C Hamano @ 2011-10-13 22:35 UTC (permalink / raw)
  To: git; +Cc: Shawn O. Pearce, Phil Hord
In-Reply-To: <7vpqi034l0.fsf@alter.siamese.dyndns.org>

Move most of the code from read_bundle_header() to parse_bundle_header()
that takes a file descriptor that is already opened for reading, and make
the former responsible only for opening the file and noticing errors.

As a logical consequence of this, is_bundle() helper function can be
implemented as a non-complaining variant of read_bundle_header() that
does not return an open file descriptor, and can be used to tighten
the check used to decide the use of bundle transport in transport.c

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 * It generally is a bad practice to base a non-RFC patch on an RFC one,
   but in any case here is how I would do the is_bundle() helper.

 bundle.c    |   39 +++++++++++++++++++++++++++++++--------
 bundle.h    |    1 +
 transport.c |    2 +-
 3 files changed, 33 insertions(+), 9 deletions(-)

diff --git a/bundle.c b/bundle.c
index 3aa715c..105b005 100644
--- a/bundle.c
+++ b/bundle.c
@@ -40,19 +40,18 @@ static int strbuf_readline_fd(struct strbuf *sb, int fd)
 	return 0;
 }
 
-int read_bundle_header(const char *path, struct bundle_header *header)
+static int parse_bundle_header(int fd, struct bundle_header *header,
+			       const char *report_path)
 {
 	struct strbuf buf = STRBUF_INIT;
-	int fd = open(path, O_RDONLY);
 	int status = 0;
 
-	if (fd < 0)
-		return error("could not open '%s'", path);
-
 	/* The bundle header begins with the signature */
 	if (strbuf_readline_fd(&buf, fd) ||
 	    strcmp(buf.buf, bundle_signature)) {
-		error("'%s' does not look like a v2 bundle file", path);
+		if (report_path)
+			error("'%s' does not look like a v2 bundle file",
+			      report_path);
 		status = -1;
 		goto abort;
 	}
@@ -77,8 +76,9 @@ int read_bundle_header(const char *path, struct bundle_header *header)
 		if (get_sha1_hex(buf.buf, sha1) ||
 		    (40 <= buf.len && !isspace(buf.buf[40])) ||
 		    (!is_prereq && buf.len <= 40)) {
-			error("unrecognized header: %s%s (%d)",
-			      (is_prereq ? "-" : ""), buf.buf, (int)buf.len);
+			if (report_path)
+				error("unrecognized header: %s%s (%d)",
+				      (is_prereq ? "-" : ""), buf.buf, (int)buf.len);
 			status = -1;
 			break;
 		} else {
@@ -98,6 +98,29 @@ int read_bundle_header(const char *path, struct bundle_header *header)
 	return fd;
 }
 
+int read_bundle_header(const char *path, struct bundle_header *header)
+{
+	int fd = open(path, O_RDONLY);
+
+	if (fd < 0)
+		return error("could not open '%s'", path);
+	return parse_bundle_header(fd, header, path);
+}
+
+int is_bundle(const char *path, int quiet)
+{
+	struct bundle_header header;
+	int fd = open(path, O_RDONLY);
+
+	if (fd < 0)
+		return 0;
+	memset(&header, 0, sizeof(header));
+	fd = parse_bundle_header(fd, &header, quiet ? NULL : path);
+	if (fd >= 0)
+		close(fd);
+	return (fd >= 0);
+}
+
 static int list_refs(struct ref_list *r, int argc, const char **argv)
 {
 	int i;
diff --git a/bundle.h b/bundle.h
index e2aedd6..c6228e2 100644
--- a/bundle.h
+++ b/bundle.h
@@ -14,6 +14,7 @@ struct bundle_header {
 	struct ref_list references;
 };
 
+int is_bundle(const char *path, int quiet);
 int read_bundle_header(const char *path, struct bundle_header *header);
 int create_bundle(struct bundle_header *header, const char *path,
 		int argc, const char **argv);
diff --git a/transport.c b/transport.c
index c9c8056..84d6480 100644
--- a/transport.c
+++ b/transport.c
@@ -913,7 +913,7 @@ struct transport *transport_get(struct remote *remote, const char *url)
 		ret->fetch = fetch_objs_via_rsync;
 		ret->push = rsync_transport_push;
 		ret->smart_options = NULL;
-	} else if (is_local(url) && is_file(url)) {
+	} else if (is_local(url) && is_file(url) && is_bundle(url, 1)) {
 		struct bundle_transport_data *data = xcalloc(1, sizeof(*data));
 		ret->data = data;
 		ret->get_refs_list = get_refs_from_bundle;
-- 
1.7.7.289.gd0d4bb

^ permalink raw reply related

* Re: [PATCH] t1402-check-ref-format: skip tests of refs beginning with slash on Windows
From: Junio C Hamano @ 2011-10-13 23:00 UTC (permalink / raw)
  To: Johannes Sixt
  Cc: mhagger, git, Jeff King, Drew Northup, Jakub Narebski,
	Heiko Voigt, Johan Herland, Julian Phillips
In-Reply-To: <4E969BFC.50706@viscovery.net>

Johannes Sixt <j.sixt@viscovery.net> writes:

> This patch is needed on top of mh/check-ref-format-3, or it could be
> inserted in front of this batch (which probably amounts to the same
> thing :-)

How about applying directly on 'master' then?

> +invalid_ref NOT_MINGW '/'
> ...
> @@ -155,21 +155,21 @@ test_expect_success 'check-ref-format --branch from subdir' '
>  '
>  
>  valid_ref_normalized() {
> -	test_expect_success "ref name '$1' simplifies to '$2'" "
> +	test_expect_success $3 "ref name '$1' simplifies to '$2'" "
>  		refname=\$(git check-ref-format --normalize '$1') &&
>  		test \"\$refname\" = '$2'"
>  }
>  invalid_ref_normalized() {
> -	test_expect_success "check-ref-format --normalize rejects '$1'" "
> +	test_expect_success $2 "check-ref-format --normalize rejects '$1'" "
>  		test_must_fail git check-ref-format --normalize '$1'"
>  }
> ...
> +valid_ref_normalized '/heads/foo' 'heads/foo' NOT_MINGW

The inconsistencies strikes me a bit.

Perhaps update to something like this?

	valid_ref_normalized () {
		if test $# = 3
                then
			prereq=$1
                        shift
		fi
                test_expect_success $prereq "ref name '$1' simplifies to '$2'"
		...
	}

^ permalink raw reply

* Re: [PATCH] t1402-check-ref-format: skip tests of refs beginning with slash on Windows
From: Junio C Hamano @ 2011-10-13 23:07 UTC (permalink / raw)
  To: Johannes Sixt
  Cc: mhagger, git, Jeff King, Drew Northup, Jakub Narebski,
	Heiko Voigt, Johan Herland, Julian Phillips
In-Reply-To: <7vfwiw33bf.fsf@alter.siamese.dyndns.org>

Bash on Windows converts program arguments that look like absolute POSIX
paths to their Windows form, i.e., drive-letter-colon format. For this
reason, those tests in t1402 that check refs that begin with a slash do not
work as expected on Windows: valid_ref tests are doomed to fail, and
invalid_ref tests fail for the wrong reason (that there is a colon rather
than that they begin with a slash).

Skip these tests.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

Junio C Hamano <gitster@pobox.com> writes:

>> +invalid_ref NOT_MINGW '/'
>> ...
>> +valid_ref_normalized '/heads/foo' 'heads/foo' NOT_MINGW
>
> The inconsistencies strikes me a bit.

Here is what I tentatively will queue.

The interdiff from yours looks like this:

    diff --git a/t/t1402-check-ref-format.sh b/t/t1402-check-ref-format.sh
    index dba5e97..1ae4d87 100755
    --- a/t/t1402-check-ref-format.sh
    +++ b/t/t1402-check-ref-format.sh
    @@ -11,8 +11,9 @@ valid_ref() {
                    prereq=$1
                    shift
            esac
    -	test_expect_success $prereq "ref name '$1' is valid${2:+ with options $2}" \
    -			"git check-ref-format $2 '$1'"
    +	test_expect_success $prereq "ref name '$1' is valid${2:+ with options $2}" "
    +		git check-ref-format $2 '$1'
    +	"
     }
     invalid_ref() {
            prereq=
    @@ -21,8 +22,9 @@ invalid_ref() {
                    prereq=$1
                    shift
            esac
    -	test_expect_success $prereq "ref name '$1' is invalid${2:+ with options $2}" \
    -			"test_must_fail git check-ref-format $2 '$1'"
    +	test_expect_success $prereq "ref name '$1' is invalid${2:+ with options $2}" "
    +		test_must_fail git check-ref-format $2 '$1'
    +	"
     }

     invalid_ref ''
    @@ -155,21 +157,35 @@ test_expect_success 'check-ref-format --branch from subdir' '
     '

     valid_ref_normalized() {
    -	test_expect_success $3 "ref name '$1' simplifies to '$2'" "
    +	prereq=
    +	case $1 in
    +	[A-Z]*)
    +		prereq=$1
    +		shift
    +	esac
    +	test_expect_success $prereq "ref name '$1' simplifies to '$2'" "
                    refname=\$(git check-ref-format --normalize '$1') &&
    -		test \"\$refname\" = '$2'"
    +		test \"\$refname\" = '$2'
    +	"
     }
     invalid_ref_normalized() {
    -	test_expect_success $2 "check-ref-format --normalize rejects '$1'" "
    -		test_must_fail git check-ref-format --normalize '$1'"
    +	prereq=
    +	case $1 in
    +	[A-Z]*)
    +		prereq=$1
    +		shift
    +	esac
    +	test_expect_success $prereq "check-ref-format --normalize rejects '$1'" "
    +		test_must_fail git check-ref-format --normalize '$1'
    +	"
     }

     valid_ref_normalized 'heads/foo' 'heads/foo'
     valid_ref_normalized 'refs///heads/foo' 'refs/heads/foo'
    -valid_ref_normalized '/heads/foo' 'heads/foo' NOT_MINGW
    +valid_ref_normalized NOT_MINGW '/heads/foo' 'heads/foo'
     valid_ref_normalized '///heads/foo' 'heads/foo'
     invalid_ref_normalized 'foo'
    -invalid_ref_normalized '/foo' NOT_MINGW
    +invalid_ref_normalized NOT_MINGW '/foo'
     invalid_ref_normalized 'heads/foo/../bar'
     invalid_ref_normalized 'heads/./foo'
     invalid_ref_normalized 'heads\foo'

Thanks.

 t/t1402-check-ref-format.sh |   88 +++++++++++++++++++++++++-----------------
 1 files changed, 52 insertions(+), 36 deletions(-)

diff --git a/t/t1402-check-ref-format.sh b/t/t1402-check-ref-format.sh
index 710fcca..1ae4d87 100755
--- a/t/t1402-check-ref-format.sh
+++ b/t/t1402-check-ref-format.sh
@@ -5,38 +5,40 @@ test_description='Test git check-ref-format'
 . ./test-lib.sh
 
 valid_ref() {
-	if test "$#" = 1
-	then
-		test_expect_success "ref name '$1' is valid" \
-			"git check-ref-format '$1'"
-	else
-		test_expect_success "ref name '$1' is valid with options $2" \
-			"git check-ref-format $2 '$1'"
-	fi
+	prereq=
+	case $1 in
+	[A-Z]*)
+		prereq=$1
+		shift
+	esac
+	test_expect_success $prereq "ref name '$1' is valid${2:+ with options $2}" "
+		git check-ref-format $2 '$1'
+	"
 }
 invalid_ref() {
-	if test "$#" = 1
-	then
-		test_expect_success "ref name '$1' is invalid" \
-			"test_must_fail git check-ref-format '$1'"
-	else
-		test_expect_success "ref name '$1' is invalid with options $2" \
-			"test_must_fail git check-ref-format $2 '$1'"
-	fi
+	prereq=
+	case $1 in
+	[A-Z]*)
+		prereq=$1
+		shift
+	esac
+	test_expect_success $prereq "ref name '$1' is invalid${2:+ with options $2}" "
+		test_must_fail git check-ref-format $2 '$1'
+	"
 }
 
 invalid_ref ''
-invalid_ref '/'
-invalid_ref '/' --allow-onelevel
-invalid_ref '/' --normalize
-invalid_ref '/' '--allow-onelevel --normalize'
+invalid_ref NOT_MINGW '/'
+invalid_ref NOT_MINGW '/' --allow-onelevel
+invalid_ref NOT_MINGW '/' --normalize
+invalid_ref NOT_MINGW '/' '--allow-onelevel --normalize'
 valid_ref 'foo/bar/baz'
 valid_ref 'foo/bar/baz' --normalize
 invalid_ref 'refs///heads/foo'
 valid_ref 'refs///heads/foo' --normalize
 invalid_ref 'heads/foo/'
-invalid_ref '/heads/foo'
-valid_ref '/heads/foo' --normalize
+invalid_ref NOT_MINGW '/heads/foo'
+valid_ref NOT_MINGW '/heads/foo' --normalize
 invalid_ref '///heads/foo'
 valid_ref '///heads/foo' --normalize
 invalid_ref './foo'
@@ -115,14 +117,14 @@ invalid_ref "$ref" --refspec-pattern
 invalid_ref "$ref" '--refspec-pattern --allow-onelevel'
 
 ref='/foo'
-invalid_ref "$ref"
-invalid_ref "$ref" --allow-onelevel
-invalid_ref "$ref" --refspec-pattern
-invalid_ref "$ref" '--refspec-pattern --allow-onelevel'
-invalid_ref "$ref" --normalize
-valid_ref "$ref" '--allow-onelevel --normalize'
-invalid_ref "$ref" '--refspec-pattern --normalize'
-valid_ref "$ref" '--refspec-pattern --allow-onelevel --normalize'
+invalid_ref NOT_MINGW "$ref"
+invalid_ref NOT_MINGW "$ref" --allow-onelevel
+invalid_ref NOT_MINGW "$ref" --refspec-pattern
+invalid_ref NOT_MINGW "$ref" '--refspec-pattern --allow-onelevel'
+invalid_ref NOT_MINGW "$ref" --normalize
+valid_ref NOT_MINGW "$ref" '--allow-onelevel --normalize'
+invalid_ref NOT_MINGW "$ref" '--refspec-pattern --normalize'
+valid_ref NOT_MINGW "$ref" '--refspec-pattern --allow-onelevel --normalize'
 
 test_expect_success "check-ref-format --branch @{-1}" '
 	T=$(git write-tree) &&
@@ -155,21 +157,35 @@ test_expect_success 'check-ref-format --branch from subdir' '
 '
 
 valid_ref_normalized() {
-	test_expect_success "ref name '$1' simplifies to '$2'" "
+	prereq=
+	case $1 in
+	[A-Z]*)
+		prereq=$1
+		shift
+	esac
+	test_expect_success $prereq "ref name '$1' simplifies to '$2'" "
 		refname=\$(git check-ref-format --normalize '$1') &&
-		test \"\$refname\" = '$2'"
+		test \"\$refname\" = '$2'
+	"
 }
 invalid_ref_normalized() {
-	test_expect_success "check-ref-format --normalize rejects '$1'" "
-		test_must_fail git check-ref-format --normalize '$1'"
+	prereq=
+	case $1 in
+	[A-Z]*)
+		prereq=$1
+		shift
+	esac
+	test_expect_success $prereq "check-ref-format --normalize rejects '$1'" "
+		test_must_fail git check-ref-format --normalize '$1'
+	"
 }
 
 valid_ref_normalized 'heads/foo' 'heads/foo'
 valid_ref_normalized 'refs///heads/foo' 'refs/heads/foo'
-valid_ref_normalized '/heads/foo' 'heads/foo'
+valid_ref_normalized NOT_MINGW '/heads/foo' 'heads/foo'
 valid_ref_normalized '///heads/foo' 'heads/foo'
 invalid_ref_normalized 'foo'
-invalid_ref_normalized '/foo'
+invalid_ref_normalized NOT_MINGW '/foo'
 invalid_ref_normalized 'heads/foo/../bar'
 invalid_ref_normalized 'heads/./foo'
 invalid_ref_normalized 'heads\foo'
-- 
1.7.7.289.gd0d4bb

^ permalink raw reply related

* [PATCH] pack-objects: protect against disappearing packs
From: Jeff King @ 2011-10-14  1:23 UTC (permalink / raw)
  To: git; +Cc: git-dev, Shawn O. Pearce, Nicolas Pitre

It's possible that while pack-objects is running, a
simultaneously running prune process might delete a pack
that we are interested in. Because we load the pack indices
early on, we know that the pack contains our item, but by
the time we try to open and map it, it is gone.

Since c715f78, we already protect against this in the normal
object access code path, but pack-objects accesses the packs
at a lower level.  In the normal access path, we call
find_pack_entry, which will call find_pack_entry_one on each
pack index, which does the actual lookup. If it gets a hit,
we will actually open and verify the validity of the
matching packfile (using c715f78's is_pack_valid). If we
can't open it, we'll issue a warning and pretend that we
didn't find it, causing us to go on to the next pack (or on
to loose objects).

Furthermore, we will cache the descriptor to the opened
packfile. Which means that later, when we actually try to
access the object, we are likely to still have that packfile
opened, and won't care if it has been unlinked from the
filesystem.

Notice the "likely" above. If there is another pack access
in the interim, and we run out of descriptors, we could
close the pack. And then a later attempt to access the
closed pack could fail (we'll try to re-open it, of course,
but it may have been deleted). In practice, this doesn't
happen because we tend to look up items and then access them
immediately.

Pack-objects does not follow this code path. Instead, it
accesses the packs at a much lower level, using
find_pack_entry_one directly. This means we skip the
is_pack_valid check, and may end up with the name of a
packfile, but no open descriptor.

We can add the same is_pack_valid check here. Unfortunately,
the access patterns of pack-objects are not quite as nice
for keeping lookup and object access together. We look up
each object as we find out about it, and the only later when
writing the packfile do we necessarily access it. Which
means that the opened packfile may be closed in the interim.

In practice, however, adding this check still has value, for
three reasons.

  1. If you have a reasonable number of packs and/or a
     reasonable file descriptor limit, you can keep all of
     your packs open simultaneously. If this is the case,
     then the race is impossible to trigger.

  2. Even if you can't keep all packs open at once, you
     may end up keeping the deleted one open (i.e., you may
     get lucky).

  3. The race window is shortened. You may notice early that
     the pack is gone, and not try to access it. Triggering
     the problem without this check means deleting the pack
     any time after we read the list of index files, but
     before we access the looked-up objects.  Triggering it
     with this check means deleting the pack means deleting
     the pack after we do a lookup (and successfully access
     the packfile), but before we access the object. Which
     is a smaller window.
---
We're seeing this at GitHub because we prune pretty
aggressively. We let pushes go into individual repositories,
but then we kick off a job to move the resulting objects
into the repository's "network" repo, which is basically a
big alternates repository for related repos.

You can reproduce locally with:

    push() {
      (cd child &&
       echo content >>file &&
       git add file &&
       git commit -q -m "change `wc -l <file`" &&
       git push -q origin HEAD
      ) &&
      (cd network.git &&
       git fetch -q ../parent refs/*:refs/* &&
       git gc --auto
      ) &&
      (cd parent.git &&
       git repack -qadl
      )
    }

    fetch() {
      rm -rf clone$1
      git.compile clone -q file://$PWD/parent.git clone$1 2>>output$1 ||
      {
        echo >&2 FAIL FAIL FAIL
        exit 1
      }
    }

    git init --bare network.git &&
    git --git-dir=network.git config transfer.unpacklimit 1 &&
    git init --bare parent.git &&
    git --git-dir=parent.git config transfer.unpacklimit 1 &&
    echo "$PWD/network.git/objects" >parent.git/objects/info/alternates
    git clone parent.git child &&
    (for i in `seq 1 1000`; do push; done) &
    (for i in `seq 1 1000`; do fetch 1; done) &
    (for i in `seq 1 1000`; do fetch 2; done) &
    wait

One process repeatedly pushes new packs, then migrates the
objects over to the network repo. Simultaneously, two
processes are constantly cloning. After 10-20 seconds, I'll
usually get unlucky and a clone will fail (because the
remote pack-objects complains about the inaccessible
packfile).

Another option would be to open the pack at point of use, and if that
fails, try to find the object elsewhere. This throws off the rest of the
code a bit, though, as we were expecting to get it from a pack, which
went into our decision about what we should delta.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/pack-objects.c |    4 ++++
 cache.h                |    1 +
 sha1_file.c            |    2 +-
 3 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 2b18de5..8681ccd 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -804,6 +804,10 @@ static int add_object_entry(const unsigned char *sha1, enum object_type type,
 		off_t offset = find_pack_entry_one(sha1, p);
 		if (offset) {
 			if (!found_pack) {
+				if (!is_pack_valid(p)) {
+					error("packfile %s cannot be accessed", p->pack_name);
+					continue;
+				}
 				found_offset = offset;
 				found_pack = p;
 			}
diff --git a/cache.h b/cache.h
index e39e160..495b468 100644
--- a/cache.h
+++ b/cache.h
@@ -1055,6 +1055,7 @@ struct extra_have_objects {
 extern const unsigned char *nth_packed_object_sha1(struct packed_git *, uint32_t);
 extern off_t nth_packed_object_offset(const struct packed_git *, uint32_t);
 extern off_t find_pack_entry_one(const unsigned char *, struct packed_git *);
+extern int is_pack_valid(struct packed_git *);
 extern void *unpack_entry(struct packed_git *, off_t, enum object_type *, unsigned long *);
 extern unsigned long unpack_object_header_buffer(const unsigned char *buf, unsigned long len, enum object_type *type, unsigned long *sizep);
 extern unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t);
diff --git a/sha1_file.c b/sha1_file.c
index 3401301..a22c5b4 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1987,7 +1987,7 @@ off_t find_pack_entry_one(const unsigned char *sha1,
 	return 0;
 }

-static int is_pack_valid(struct packed_git *p)
+int is_pack_valid(struct packed_git *p)
 {
 	/* An already open pack is known to be valid. */
 	if (p->pack_fd != -1)
-- 
1.7.6.4.37.g43b58b

^ permalink raw reply related

* Re: [PATCH] pack-objects: protect against disappearing packs
From: Jeff King @ 2011-10-14  1:31 UTC (permalink / raw)
  To: git; +Cc: git-dev, Shawn O. Pearce, Nicolas Pitre
In-Reply-To: <20111014012320.GA4395@sigill.intra.peff.net>

On Thu, Oct 13, 2011 at 09:23:20PM -0400, Jeff King wrote:

> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index 2b18de5..8681ccd 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -804,6 +804,10 @@ static int add_object_entry(const unsigned char *sha1, enum object_type type,
>  		off_t offset = find_pack_entry_one(sha1, p);
>  		if (offset) {
>  			if (!found_pack) {
> +				if (!is_pack_valid(p)) {
> +					error("packfile %s cannot be accessed", p->pack_name);
> +					continue;
> +				}

This message is modeled after the one in find_pack_entry. However,
they're not really errors, since we will try to find the object
elsewhere (and generally succeed). So the messages could just go away.
Though they can also alert you to something fishy going on (like a
packfile with bad permissions). But perhaps we should downgrade them
like this:

-- >8 --
Subject: [PATCH] downgrade "packfile cannot be accessed" errors to warnings

These can happen if another process simultaneously prunes a
pack. But that is not usually an error condition, because a
properly-running prune should have repacked the object into
a new pack. So we will notice that the pack has disappeared
unexpectedly, print a message, try other packs (possibly
after re-scanning the list of packs), and find it in the new
pack.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/pack-objects.c |    2 +-
 sha1_file.c            |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 8681ccd..ba3705d 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -805,7 +805,7 @@ static int add_object_entry(const unsigned char *sha1, enum object_type type,
 		if (offset) {
 			if (!found_pack) {
 				if (!is_pack_valid(p)) {
-					error("packfile %s cannot be accessed", p->pack_name);
+					warning("packfile %s cannot be accessed", p->pack_name);
 					continue;
 				}
 				found_offset = offset;
diff --git a/sha1_file.c b/sha1_file.c
index a22c5b4..27f3b9b 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2038,7 +2038,7 @@ static int find_pack_entry(const unsigned char *sha1, struct pack_entry *e)
 			 * was loaded!
 			 */
 			if (!is_pack_valid(p)) {
-				error("packfile %s cannot be accessed", p->pack_name);
+				warning("packfile %s cannot be accessed", p->pack_name);
 				goto next;
 			}
 			e->offset = offset;
-- 
1.7.6.4.37.g43b58b

^ permalink raw reply related

* Re: [BUG] git checkout <branch> allowed with uncommitted changes
From: Jeff King @ 2011-10-14  1:38 UTC (permalink / raw)
  To: arQon; +Cc: git
In-Reply-To: <loom.20111013T203610-130@post.gmane.org>

On Thu, Oct 13, 2011 at 06:56:14PM +0000, arQon wrote:

> I'll give a shot, though I don't know how good it'll be. Off the top of my
> head, I don't see any good way to explain the inconsistency with LOCAL CHANGES
> sometimes preventing switches and sometimes not, based on what is to the user
> an arbitrary set of rules that has nothing to do with the *current state* of
> the worktree, but rather the state of those files in prior commits.

The rules are fairly straightforward.  You are moving from branch A to
branch B. If path X is not changed going from A to B, git will not touch
it, whether or not you have local changes. If path X is changed going
from A to B, then git will refuse the checkout, and you have the option
of:

  1. checkout -f: overwrite your local changes with what's in B

  2. checkout -m: merge your changes with what's in B (using A as a
                  common ancestor)

> But sure, I'll see if I can come up with something. If nothing else,
> having the manpage at least explain what "M" means; that it can be
> potentially disastrous; and what you need to do to avoid it, would be
> a definite plus.

You keep saying things like "disastrous". Git's rules are specifically
designed to be as flexible as possible without allowing the checkout
command to cause data loss.

Do you actually have a case that causes irrecoverable data loss?

Note that I don't count "these changes were based on A, now they are
based on B" as data loss. Your file content is still completely intact,
and if you want to make them based on "A" again, you just need to
"git checkout A" again.

-Peff

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox