Git development

Git development
 help / color / mirror / Atom feed

* Re: How to use git attributes to configure server-side checks?
From: Jeff King @ 2011-09-22 21:04 UTC (permalink / raw)
  To: Jay Soffian; +Cc: Michael Haggerty, Junio C Hamano, git discussion list
In-Reply-To: <20110922205856.GA8563@sigill.intra.peff.net>

On Thu, Sep 22, 2011 at 04:58:56PM -0400, Jeff King wrote:

> That makes some sense to me. As Junio pointed out, there is a catch with
> "diff -R". In that case, I would still think you would use the "second"
> commit, even though we're reversing the diff. So:
> 
>   git diff A B
> 
> would not be exactly equivalent to:
> 
>   git diff -R B A
> 
> in that the second would use attributes from "A" instead of "B".

I misread Junio's comment a bit. Re-reading it, this is exactly the
inconsistency he complained about. However, I consider it somewhat of a
feature. We currently have two ways to express the same thing, and you
arrive at one or the other based on the way you are thinking of the
problem. But we can use that to disambiguate between the two cases; one
is about going from A to B, and one is about inverting the operation of
going from B to A. Right now they're equivalent, but they don't have to
be.

If you read the rest of my message, you will see that I think picking
"first" or "second" arbitrarily like this might be barking up the wrong
tree.  But I just wanted to clarify that point.

-Peff

^ permalink raw reply

* Re: How to use git attributes to configure server-side checks?
From: Jeff King @ 2011-09-22 20:58 UTC (permalink / raw)
  To: Jay Soffian; +Cc: Michael Haggerty, Junio C Hamano, git discussion list
In-Reply-To: <CAG+J_DxdP2qHhttJOtWQTKeiDV2YbC_A_F+b9sDOZsWhWxjcjw@mail.gmail.com>

On Thu, Sep 22, 2011 at 02:41:42PM -0400, Jay Soffian wrote:

> attr.c says:
> 
>   a. built-in attributes
>   b. $(prefix)/etc/gitattributes unless GIT_ATTR_NOSYSTEM is set
>   c. core.attributesfile
>   d. per-directory .gitattributes
>   e. $GIT_DIR/info/attributes
> 
> The mechanics of (d) are established by git_attr_direction:
> 
>   GIT_ATTR_CHECKIN: working-copy, then index
>   GIT_ATTR_CHECKOUT: index, then working-copy
>   GIT_ATTR_INDEX: index-only

Thanks for hunting it down. I had been thinking that "attributes from
tree" would come either before or after (d) above, but the concept
really fits better into the second list (i.e., they're per-directory
attributes, it's just a matter of which set of directories).

> Where GIT_ATTR_CHECKIN is the default direction and GIT_ATTR_CHECKOUT
> is used while checking-out files from the index. (GIT_ATTR_INDEX is
> used only by git-archive.)

Hrm. But git archive works correctly in a bare repo with no index at
all. It looks like we just populate an in-core index and feed that to
git_attr_set_direction. Which is a bit suboptimal for something like
diff, which might not need to look at all parts of the tree (I guess it
is similarly suboptimal for archive with pathspecs).

But still, those are just performance considerations. We could use the
same trick for diff/log in a bare repo. I guess we'd end up refreshing
the index from each commit in a multi-commit log, which might be
noticeably slow.

> Consistent with that, when comparing two commits (diff-tree), I think
> you look at the .gitattributes in the second commit.

That makes some sense to me. As Junio pointed out, there is a catch with
"diff -R". In that case, I would still think you would use the "second"
commit, even though we're reversing the diff. So:

  git diff A B

would not be exactly equivalent to:

  git diff -R B A

in that the second would use attributes from "A" instead of "B".

However, I think this is skirting around a somewhat larger issue, which
is that gitattributes are sometimes about "this is what the file is like
at the current state of history", and sometimes about "this is what this
file is like throughout history".

For example, imagine I've got a repository of binary files. At some
point, I decide to use gitattributes to configure a textconv filter. In
my non-bare repo, when I do "git log" I get nice textconv'd diffs going
back through all of history. But if I push to a bare repo and try to do
a "git log" there, my attributes are ignored. So in this case, I like
that the working tree's attributes are applied retroactively, and I'd
like the bare repo to do the same.

Now consider a different example. I have a repo with a script "foo" that
contains perl code, and I add .gitattributes marking it as such (for
func headers, or maybe I have a magic external diff, or whatever[1]).
Later, I decide that I hate perl and rewrite it in python, updating
gitattributes. Now I _don't_ want the retroactive behavior. When I do a
"git log -p", I want to see the old commits with the perl script shown
using the perl attribute, not the python. But what the heck would it
mean to diff a recent python commit against an old perl one?

Even though we think of the diff attributes as "here's how you diff",
they are really "here are some annotations about the end points". So for
something like a textconv, you care about the attributes of _both_
sides of a diff, applying the attributes of the "from" tree to the paths
in that tree. Something like funcname is harder. I guess you'd probably
want to use the attributes from the "from" tree, since it is about
reporting context from the "from" side.

So the semantics really end up depending on the particular attribute.
Something like "-delta" exists more outside the history or the state of
a particular commit.

So I think doing it "right" would be a lot more complicated than just
reading the commits from a particular tree. I think it would mean adding
more context to all attribute lookups, and having each caller decide how
the results should be used.

However, retroactively applying attributes from the working tree, even
though it is sometimes wrong (e.g., we get the perl/python thing wrong
_now_ if you have a working tree), is often convenient in practice
(because otherwise you end up duplicating your per-directory
gitattributes in your info/attributes file), and rarely is it actually
wrong (because changing the type of a file without changing its path
isn't all that common).

So if the status quo with working trees (i.e., retroactively applying
the current gitattributes to past commits) is good enough, perhaps the
bare-repository solution should be modeled the same way? In other words,
should "git log -p" in a bare repo be looking for attributes not in the
commits being diffed, but rather in HEAD[2]?

That at least would make it consistent with the non-bare behavior. And
then if we want to move forward with making particular attributes more
aware of their context, we can do it in _both_ the bare and non-bare
cases.

-Peff

[1] The perl/python thing is probably not a huge deal, as funcnames are
    the most likely thing to configure. But you can imagine it would be
    much worse if some binary file changes formats, and you were using
    textconv. Or something with word-diff, perhaps.

[2] You can almost do this with:

      git show HEAD:.gitattributes >info/attributes
      git show commit-you-care-about
      rm info/attributes

    except that it won't handle any per-directory attributes files from
    subdirectories. So I think you'd want a "--attributes-from=HEAD"
    diff option or something similar.

^ permalink raw reply

* Re: [PATCH v2 1/2] sparse checkout: show error messages when worktree shaping fails
From: Joshua Jensen @ 2011-09-22 19:57 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy
  Cc: git, Junio C Hamano, Michael J Gruber
In-Reply-To: <1316690663-29382-1-git-send-email-pclouds@gmail.com>

----- Original Message -----
From: Nguyễn Thái Ngọc Duy
Date: 9/22/2011 5:24 AM
> diff --git a/unpack-trees.c b/unpack-trees.c
> index cc616c3..fcf40a0 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -1089,6 +1089,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>   		 */
>   		mark_new_skip_worktree(o->el,&o->result, CE_ADDED, CE_SKIP_WORKTREE | CE_NEW_SKIP_WORKTREE);
>
> +		ret = 0;
>   		for (i = 0; i<  o->result.cache_nr; i++) {
>   			struct cache_entry *ce = o->result.cache[i];
>
> @@ -1101,17 +1102,23 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>   			 * correct CE_NEW_SKIP_WORKTREE
>   			 */
>   			if (ce->ce_flags&  CE_ADDED&&
> -			    verify_absent(ce, ERROR_WOULD_LOSE_UNTRACKED_OVERWRITTEN, o))
> -					return -1;
> +			    verify_absent(ce, ERROR_WOULD_LOSE_UNTRACKED_OVERWRITTEN, o)) {
> +				if (!o->show_all_errors)
> +					goto return_failed;
> +				ret = -1;
> +			}
>
>   			if (apply_sparse_checkout(ce, o)) {
> +				if (!o->show_all_errors)
> +					goto return_failed;
>   				ret = -1;
> -				goto done;
>   			}
>   			if (!ce_skip_worktree(ce))
>   				empty_worktree = 0;
>
>   		}
> +		if (ret<  0)
> +			goto return_failed;
>   		if (o->result.cache_nr&&  empty_worktree) {
>   			/* dubious---why should this fail??? */
>   			ret = unpack_failed(o, "Sparse checkout leaves no entry on working directory");
I can confirm that this version of the patch works for me with multiple 
untracked files in a sparse checkout.

-Josh

^ permalink raw reply

* [PATCH 2/2] fast-import: don't allow to note on empty branch
From: Dmitry Ivankov @ 2011-09-22 19:47 UTC (permalink / raw)
  To: git
  Cc: Jonathan Nieder, Shawn O. Pearce, David Barr, Sverre Rabbelier,
	Dmitry Ivankov
In-Reply-To: <1316720825-32552-1-git-send-email-divanorama@gmail.com>

'reset' command makes fast-import start a branch from scratch. It's name
is kept in lookup table but it's sha1 is null_sha1 (special value).
'notemodify' command can be used to add a note on branch head given it's
name. lookup_branch() is used it that case and it doesn't check for
null_sha1. So fast-import writes a note for null_sha1 object instead of
giving a error.

Add a check to deny adding a note on empty branch and add a corresponding
test.

Signed-off-by: Dmitry Ivankov <divanorama@gmail.com>
---
 fast-import.c          |    2 ++
 t/t9300-fast-import.sh |   17 +++++++++++++++++
 2 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/fast-import.c b/fast-import.c
index c44cc11..a8a3ad1 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -2416,6 +2416,8 @@ static void note_change_n(struct branch *b, unsigned char old_fanout)
 	/* <committish> */
 	s = lookup_branch(p);
 	if (s) {
+		if (is_null_sha1(s->sha1))
+			die("Can't add a note on empty branch.");
 		hashcpy(commit_sha1, s->sha1);
 	} else if (*p == ':') {
 		uintmax_t commit_mark = strtoumax(p + 1, NULL, 10);
diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index 0b97d7a..bd32b91 100755
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@ -1987,6 +1987,23 @@ test_expect_success \
 	'Q: verify second note for second commit' \
 	'git cat-file blob refs/notes/foobar:$commit2 >actual && test_cmp expect actual'
 
+cat >input <<EOF
+reset refs/heads/Q0
+
+commit refs/heads/note-Q0
+committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+data <<COMMIT
+Note for an empty branch.
+COMMIT
+
+N inline refs/heads/Q0
+data <<NOTE
+some note
+NOTE
+EOF
+test_expect_success \
+	'Q: deny note on empty branch' \
+	'test_must_fail git fast-import <input'
 ###
 ### series R (feature and option)
 ###
-- 
1.7.3.4

^ permalink raw reply related

* [PATCH 1/2] fast-import: don't allow to tag empty branch
From: Dmitry Ivankov @ 2011-09-22 19:47 UTC (permalink / raw)
  To: git
  Cc: Jonathan Nieder, Shawn O. Pearce, David Barr, Sverre Rabbelier,
	Dmitry Ivankov
In-Reply-To: <1316720825-32552-1-git-send-email-divanorama@gmail.com>

'reset' command makes fast-import start a branch from scratch. It's name
is kept in lookup table but it's sha1 is null_sha1 (special value).
'tag' command can be used to tag a branch by it's name. lookup_branch()
is used it that case and it doesn't check for null_sha1. So fast-import
writes a tag for null_sha1 object instead of giving a error.

Add a check to deny tagging an empty branch and add a corresponding test.

Signed-off-by: Dmitry Ivankov <divanorama@gmail.com>
---
 fast-import.c          |    2 ++
 t/t9300-fast-import.sh |   12 ++++++++++++
 2 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/fast-import.c b/fast-import.c
index 742e7da..c44cc11 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -2717,6 +2717,8 @@ static void parse_new_tag(void)
 	from = strchr(command_buf.buf, ' ') + 1;
 	s = lookup_branch(from);
 	if (s) {
+		if (is_null_sha1(s->sha1))
+			die("Can't tag an empty branch.");
 		hashcpy(sha1, s->sha1);
 		type = OBJ_COMMIT;
 	} else if (*from == ':') {
diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index 1a6c066..0b97d7a 100755
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@ -820,6 +820,18 @@ test_expect_success \
 	'test 1 = `git rev-list J | wc -l` &&
 	 test 0 = `git ls-tree J | wc -l`'
 
+cat >input <<INPUT_END
+reset refs/heads/J2
+
+tag wrong_tag
+from refs/heads/J2
+data <<EOF
+Tag branch that was reset.
+EOF
+INPUT_END
+test_expect_success \
+	'J: tag must fail on empty branch' \
+	'test_must_fail git fast-import <input'
 ###
 ### series K
 ###
-- 
1.7.3.4

^ permalink raw reply related

* [PATCH 0/2] fast-import: empty/reset branch bugs
From: Dmitry Ivankov @ 2011-09-22 19:47 UTC (permalink / raw)
  To: git
  Cc: Jonathan Nieder, Shawn O. Pearce, David Barr, Sverre Rabbelier,
	Dmitry Ivankov

fast-import uses null_sha1 for empty branches. It doesn't make a 
null_sha1 parent nor writes out a branch with null_sha1 head. But
for notemodify and tag commands there is no check for null_sha1
and so bad tag/notes are produced instead of a error message.

Dmitry Ivankov (2):
  fast-import: don't allow to tag empty branch
  fast-import: don't allow to note on empty branch

 fast-import.c          |    4 ++++
 t/t9300-fast-import.sh |   29 +++++++++++++++++++++++++++++
 2 files changed, 33 insertions(+), 0 deletions(-)

-- 
1.7.3.4

^ permalink raw reply

* Re: any way to "re-sync" a bare repository against another bare repository?
From: Chris Friesen @ 2011-09-22 19:23 UTC (permalink / raw)
  To: Michael Witten; +Cc: git
In-Reply-To: <7142366f54c44cea82542adf8aea5bb9-mfwitten@gmail.com>

Thanks for that very thorough response.

The rationale for this is somewhat convoluted...I have a vendor-supplied 
build system that expects to be pointed at a bare repository.  For 
performance reasons I want to have a local bare repo on each build 
machine that can be kept in sync with a master repo on a main server.

Chris


On 09/22/2011 12:50 PM, Michael Witten wrote:
> On Thu, 22 Sep 2011 11:22:37 -0600, Chris Friesen wrote:
>
>> Suppose I have a parent bare repository.  I do a "git clone --bare" to
>> create a child repository, and then clone the child to create a
>> grandchild repository.
>
> Firstly, what exactly is the scenario you are trying to achieve? Perhaps
> there is a better way to do what you are trying to do.
>
>> If changes get pushed into the parent repository, is there any way to
>> cause the child to be updated?
>
> The documentation answers your question (but badly, as with much of the
> documentation); from `git help clone':
>
>    --bare
>        Make a bare GIT repository. That is, instead of creating
>        <directory>  and placing the administrative files in
>        <directory>/.git, make the<directory>  itself the $GIT_DIR.
>        This obviously implies the -n because there is nowhere to
>        check out the working tree. Also the branch heads at the
>        remote are copied directly to corresponding local branch
>        heads, without mapping them to refs/remotes/origin/. When
>        this option is used, neither remote-tracking branches nor
>        the related configuration variables are created.
>
> In particular:
>
>                                    Also the branch heads at the
>        remote are copied directly to corresponding local branch
>        heads, without mapping them to refs/remotes/origin/. When
>        this option is used, neither remote-tracking branches nor
>        the related configuration variables are created.
>
> In particular:
>
>                                                             When
>        this option is used, neither remote-tracking branches nor
>        the related configuration variables are created.
>
> Thus, you have to explicitly tell git what you fetched and which
> branch heads should be updated.
>
> Consider this:
>
>    $ git init parent
>    $ git clone        parent child0
>    $ git clone --bare parent child1
>
> Now, look at the config file for the child0 repository:
>
>    $ cat child0/.git/config
>    [core]
>            repositoryformatversion = 0
>            filemode = true
>            bare = false
>            logallrefupdates = true
>    [remote "origin"]
>            fetch = +refs/heads/*:refs/remotes/origin/*
>            url = /path/to/parent
>    [branch "master"]
>            remote = origin
>            merge = refs/heads/master
>
> In particular:
>
>            fetch = +refs/heads/*:refs/remotes/origin/*
>
> That is the default `refspec'; when `git fetch' is not explicitly
> told on the command line what to fetch and which branch head[s] to
> update, then this refspec is used as a default.
>
> Now, look at the config file for the child1 repository:
>
>    $ cat child1/config
>    [core]
>            repositoryformatversion = 0
>            filemode = true
>            bare = true
>    [remote "origin"]
>            url = /path/to/parent
>
> In particular, note that a bare repository doesn't include any
> such default information for `git fetch'. However, you could be
> explicit about it; from within the chidl1 repo:
>
>    $ git fetch origin '+refs/heads/*:refs/remotes/origin/*'
>
>> Just a "git fetch<parent>" doesn't seem to help.  If I set up parent as
>> a remote branch I can fetch it,
>
> Firstly, it doesn't make any sense to say "set up parent as a remote
> branch"; what you mean is "set up `<parent>' as a remote with a default
> refspec that creates any associated remote-tracking branch heads".
>
> Secondly, by setting up `<parent>' as a remote, you are creating the
> missing refspec in your config file:
>
>    [remote "<parent>"]
>            url = /path/to/parent
>            fetch = +refs/heads/*:refs/remotes/<parent>/*
>
> which is why you get this:
>
>> but then it shows all the branches as "<parent>/<branch>" rather
>> than updating the child.
>
> You need a different refspec, namely:
>
>    +refs/heads/*:refs/heads/*
>
> So, either be explicit:
>
>    git fetch '<parent>' '+refs/heads/*:refs/heads/*'
>
> or update your config:
>
>    git config 'remote.<parent>.fetch' '+refs/heads/*:refs/heads/*'
>
> Of course, there is an easier way that does all of this [and more]
> for you:
>
>> I just tried a "git clone --mirror" to create the child and it seems to
>> allow me to pick up changes in the parent via "git fetch".  Is that the
>> proper way to handle this?
>
> The documentation answers your question (but badly, as with much of the
> documentation); from `git help clone':
>
>    --mirror
>        Set up a mirror of the source repository. This implies
>        --bare. Compared to --bare, --mirror not only maps local
>        branches of the source to local branches of the target, it
>        maps all refs (including remote-tracking branches, notes
>        etc.) and sets up a refspec configuration such that all
>        these refs are overwritten by a git remote update in the
>        target repository.
>
> In particular:
>
>                  sets up a refspec configuration such that all
>        these refs are overwritten by a git remote update in the
>        target repository.
>
> Consider this:
>
>    $ git clone --mirror parent child2
>    $ cat child2/config
>    [core]
>            repositoryformatversion = 0
>            filemode = true
>            bare = true
>    [remote "origin"]
>            fetch = +refs/*:refs/*
>            mirror = true
>            url = /path/to/parent
>
> In particular:
>
>            fetch = +refs/*:refs/*
>
> That's a very liberal refspec! It basically says that fetch
> should mirror everything by default.
>
> Sincerely,
> Michael Witten


-- 
Chris Friesen
Software Developer
GENBAND
chris.friesen@genband.com
www.genband.com

^ permalink raw reply

* Re: How to use git attributes to configure server-side checks?
From: Junio C Hamano @ 2011-09-22 19:22 UTC (permalink / raw)
  To: Jay Soffian; +Cc: Jeff King, Michael Haggerty, git discussion list
In-Reply-To: <CAG+J_DxdP2qHhttJOtWQTKeiDV2YbC_A_F+b9sDOZsWhWxjcjw@mail.gmail.com>

Jay Soffian <jaysoffian@gmail.com> writes:

> Consistent with that, when comparing two commits (diff-tree), I think
> you look at the .gitattributes in the second commit.

That would make "diff A B" and "diff -R B A" behave differently.

^ permalink raw reply

* Re: any way to "re-sync" a bare repository against another bare repository?
From: Michael Witten @ 2011-09-22 18:50 UTC (permalink / raw)
  To: Chris Friesen; +Cc: git
In-Reply-To: <4E7B6EDD.1040106@genband.com>

On Thu, 22 Sep 2011 11:22:37 -0600, Chris Friesen wrote:

> Suppose I have a parent bare repository.  I do a "git clone --bare" to 
> create a child repository, and then clone the child to create a 
> grandchild repository.

Firstly, what exactly is the scenario you are trying to achieve? Perhaps
there is a better way to do what you are trying to do.

> If changes get pushed into the parent repository, is there any way to 
> cause the child to be updated?

The documentation answers your question (but badly, as with much of the
documentation); from `git help clone':

  --bare
      Make a bare GIT repository. That is, instead of creating
      <directory> and placing the administrative files in
      <directory>/.git, make the <directory> itself the $GIT_DIR.
      This obviously implies the -n because there is nowhere to
      check out the working tree. Also the branch heads at the
      remote are copied directly to corresponding local branch
      heads, without mapping them to refs/remotes/origin/. When
      this option is used, neither remote-tracking branches nor
      the related configuration variables are created.

In particular:

                                  Also the branch heads at the
      remote are copied directly to corresponding local branch
      heads, without mapping them to refs/remotes/origin/. When
      this option is used, neither remote-tracking branches nor
      the related configuration variables are created.

In particular:

                                                           When
      this option is used, neither remote-tracking branches nor
      the related configuration variables are created.

Thus, you have to explicitly tell git what you fetched and which
branch heads should be updated.

Consider this:

  $ git init parent
  $ git clone        parent child0
  $ git clone --bare parent child1

Now, look at the config file for the child0 repository:

  $ cat child0/.git/config 
  [core]
          repositoryformatversion = 0
          filemode = true
          bare = false
          logallrefupdates = true
  [remote "origin"]
          fetch = +refs/heads/*:refs/remotes/origin/*
          url = /path/to/parent
  [branch "master"]
          remote = origin
          merge = refs/heads/master

In particular:

          fetch = +refs/heads/*:refs/remotes/origin/*

That is the default `refspec'; when `git fetch' is not explicitly
told on the command line what to fetch and which branch head[s] to
update, then this refspec is used as a default.

Now, look at the config file for the child1 repository:

  $ cat child1/config
  [core]
          repositoryformatversion = 0
          filemode = true
          bare = true
  [remote "origin"]
          url = /path/to/parent

In particular, note that a bare repository doesn't include any
such default information for `git fetch'. However, you could be
explicit about it; from within the chidl1 repo:

  $ git fetch origin '+refs/heads/*:refs/remotes/origin/*'

> Just a "git fetch <parent>" doesn't seem to help.  If I set up parent as 
> a remote branch I can fetch it,

Firstly, it doesn't make any sense to say "set up parent as a remote
branch"; what you mean is "set up `<parent>' as a remote with a default
refspec that creates any associated remote-tracking branch heads".

Secondly, by setting up `<parent>' as a remote, you are creating the
missing refspec in your config file:

  [remote "<parent>"]
          url = /path/to/parent
          fetch = +refs/heads/*:refs/remotes/<parent>/*

which is why you get this:

> but then it shows all the branches as "<parent>/<branch>" rather
> than updating the child.

You need a different refspec, namely:

  +refs/heads/*:refs/heads/*

So, either be explicit:

  git fetch '<parent>' '+refs/heads/*:refs/heads/*'

or update your config:

  git config 'remote.<parent>.fetch' '+refs/heads/*:refs/heads/*'

Of course, there is an easier way that does all of this [and more]
for you:

> I just tried a "git clone --mirror" to create the child and it seems to 
> allow me to pick up changes in the parent via "git fetch".  Is that the 
> proper way to handle this?

The documentation answers your question (but badly, as with much of the
documentation); from `git help clone':

  --mirror
      Set up a mirror of the source repository. This implies
      --bare. Compared to --bare, --mirror not only maps local
      branches of the source to local branches of the target, it
      maps all refs (including remote-tracking branches, notes
      etc.) and sets up a refspec configuration such that all
      these refs are overwritten by a git remote update in the
      target repository.

In particular:

                sets up a refspec configuration such that all
      these refs are overwritten by a git remote update in the
      target repository.

Consider this:

  $ git clone --mirror parent child2
  $ cat child2/config
  [core]
          repositoryformatversion = 0
          filemode = true
          bare = true
  [remote "origin"]
          fetch = +refs/*:refs/*
          mirror = true
          url = /path/to/parent

In particular:

          fetch = +refs/*:refs/*

That's a very liberal refspec! It basically says that fetch
should mirror everything by default.

Sincerely,
Michael Witten

^ permalink raw reply

* Re: How to use git attributes to configure server-side checks?
From: Jay Soffian @ 2011-09-22 18:41 UTC (permalink / raw)
  To: Jeff King; +Cc: Michael Haggerty, Junio C Hamano, git discussion list
In-Reply-To: <20110922171340.GA2934@sigill.intra.peff.net>

On Thu, Sep 22, 2011 at 1:13 PM, Jeff King <peff@peff.net> wrote:
>
> No, we definitely don't use in-tree gitattributes. IIRC, there are some
> precedence and ordering questions. I think the ordering now is:
>
>  1. Look in $GIT_DIR/info/attributes
>
>  2. If not found, look in per-directory .gitattributes
>
>  3. If not found, look in core.gitattributesfile
>
> Where do in-tree attributes fit in? Between 1 and 2? Or 2 and 3? And
> which tree do we look at?

attr.c says:

  a. built-in attributes
  b. $(prefix)/etc/gitattributes unless GIT_ATTR_NOSYSTEM is set
  c. core.attributesfile
  d. per-directory .gitattributes
  e. $GIT_DIR/info/attributes

The mechanics of (d) are established by git_attr_direction:

  GIT_ATTR_CHECKIN: working-copy, then index
  GIT_ATTR_CHECKOUT: index, then working-copy
  GIT_ATTR_INDEX: index-only

Where GIT_ATTR_CHECKIN is the default direction and GIT_ATTR_CHECKOUT
is used while checking-out files from the index. (GIT_ATTR_INDEX is
used only by git-archive.)

Note that (d) only occurs in non-bare repos or if direction is GIT_ATTR_INDEX.

> Here are some examples:
>
>  a. If I do "git checkout branch-foo", we should look at branch-foo's
>     tree for things like crlf, right?  Do we still fall back to
>     per-directory .gitattributes in the working tree? On the one hand,
>     they're not really relevant to the commit we're moving to. But they
>     are respected in the current code, and can be useful when moving to
>     old commits which lack attributes.
>
>     I think this is where the index magic comes in in the current code
>     (we do something like "load the index, then respect gitattributes
>     from the index").
>
>     So maybe this is solved already.
>
>  b. You're diffing commit $a against commit $b. Whose gitattributes
>     have precedence? Is order important? Are gitattributes in the
>     working tree and index relevant?
>
>  c. You're diffing commit $a against HEAD. Which gitattributes have
>     precedence? Again, is order important (i.e., does "diff -R" look at
>     different attributes than "diff")?
>
> I'm sure there are others, too. And I don't think any of these is
> insurmountable. But somebody needs to think through a lot of cases and
> come up with consistent, sane behavior that does the right thing in each
> case (considering both bare repositories and ones with working trees).

I think that diff-index --cached or when there is no working tree,
should set the direction to GIT_ATTR_INDEX so that it may be used with
a bare repo. In such case, the .gitattributes would come from the
index.

Consistent with that, when comparing two commits (diff-tree), I think
you look at the .gitattributes in the second commit.

j.

^ permalink raw reply

* --simplify-by-decoration, but include branch points
From: Andrew Pimlott @ 2011-09-22 18:24 UTC (permalink / raw)
  To: git

The --simplify-by-decoration option to git log is a great way to view
branch topology.  However, it is a bit misleading because it does not
necessarily show branch points.  For example, I have a repository that
looks like:

    * 6045d25 (HEAD, master) 3
    | * 8daa592 (branch) 2.1
    |/
    * a4da73a 2
    * 014106d (tag: v1) 1

This is from "git log --decorate --all --graph --oneline".  If I add
--simplify-by-decoration, I get

    * 6045d25 (HEAD, master) 3
    | * 8daa592 (branch) 2.1
    |/
    * 014106d (tag: v1) 1

Note it appears as though the branch point is 014106d, when it's really
014106d.  I would love to see an option like --simplify-by-decoration
that also selects branch points for display, maybe
--simplify-by-branch-point.  (It should be possible to combine it with
--simplify-by-decoration.)

Is there anything like this?

Andrew

^ permalink raw reply

* Re: Bug: git log --numstat counts wrong
From: Junio C Hamano @ 2011-09-22 17:51 UTC (permalink / raw)
  To: Alexander Pepper; +Cc: git, Tay Ray Chuan
In-Reply-To: <7vobyd1vmo.fsf@alter.siamese.dyndns.org>

Junio C Hamano <gitster@pobox.com> writes:

> Alexander Pepper <pepper@inf.fu-berlin.de> writes:
>
>> Am 21.09.2011 um 14:24 schrieb Junio C Hamano:
>>>> $ git log --numstat 48a07e7e533f507228e8d1c99d4d48e175e14260
>>>> [...]
>>>> 11      10      src/java/voldemort/server/storage/StorageService.java
>>> 
>>> Didn't we update it this already? I seem to get 10/9 here not 11/10.
>>
>> Current 'maint' (cd2b8ae9), 'master' (4b5eac7f)...
>
> That's a tad old master you seem to have.
>
> Strangely, bisection points at 27af01d5523, which was supposed to be only
> about performance and never about correctness. There is something fishy
> going on....

In any case, I think the real issue is that depending on how much context
you ask, the resulting diff is different (and both are valid diffs). If
you ask "log -p" (or "diff" or "show") to produce a patch, then we use the
default 3-line context. And then you feed that to an external diffstat to
count the number of deleted and added lines to get one set of numbers.

The --numstat (and --diffstat) code seems to be running the internal diff
machinery with 0-line context and counting the resulting diff internally.

And of course the results between the above two would be different because
diff can match lines differently when given different number of context
lines to include in the result.

So perhaps a good sanity-check for you to try (note: not checking your
sanity, but checking the sanity of the above analysis) would be to do:

  $ git show 48a07e7e53 -- $that_path | diffstat
  $ git show -U0 48a07e7e53 -- $that_path | diffstat
  $ git show --numstat 48a07e7e53 -- $that_path
  $ git show --stat 48a07e7e53 -- $that_path

and see how they compare (make sure to use the same version of git for
these experiments). The first one uses the default 3-lines context, the
second one forces 0-line context, and the last two uses 0-line context
hardwired in the code.

Applying the following patch should make the last two use the default
context or -U$num given from the command line to be consistent with the
codepath where we generate textual patches.

 diff.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/diff.c b/diff.c
index 9038f19..302ef33 100644
--- a/diff.c
+++ b/diff.c
@@ -2251,6 +2251,8 @@ static void builtin_diffstat(const char *name_a, const char *name_b,
 		memset(&xpp, 0, sizeof(xpp));
 		memset(&xecfg, 0, sizeof(xecfg));
 		xpp.flags = o->xdl_opts;
+		xecfg.ctxlen = o->context;
+		xecfg.interhunkctxlen = o->interhunkcontext;
 		xdi_diff_outf(&mf1, &mf2, diffstat_consume, diffstat,
 			      &xpp, &xecfg);
 	}

^ permalink raw reply related

* Re: [PATCH v3] Docs: Clarify the --tags option of `git fetch'
From: Michael Witten @ 2011-09-22 17:38 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Anatol Pomozov, Drew Northup, Andrew Ardill, Daniel Johnson, git
In-Reply-To: <CAMOZ1BtBEHae7x-cn=DtnzwoyC_sYedgFmyNwrDuju+kcJU4hg@mail.gmail.com>

On Thu, Sep 22, 2011 at 17:35, Michael Witten <mfwitten@gmail.com> wrote:
> On Thu, Sep 22, 2011 at 17:10, Junio C Hamano <gitster@pobox.com> wrote:
>> Michael Witten <mfwitten@gmail.com> writes:
>>
>>> 8<-----------8<-----------8<-----------8<-----------8<-----------8<-----------
>>>
>>> See the discussion starting here:
>>>
>>>   [PATCH] Clarify that '--tags' fetches tags only
>>>   Message-ID: <1314997486-29996-1-git-send-email-anatol.pomozov@gmail.com>
>>>   http://thread.gmane.org/gmane.comp.version-control.git/180636
>>
>> It is a good practice to point to earlier discussions while polishing
>> patch, and it also is good to include pointers in the commit log message
>> as a supporting material (additional reading), but that is _NOT_ a
>> substitute for a properly written commit log message. You need to state
>> what problem you are trying to fix and how the proposed patch fixes it.
>>
>>>  Documentation/fetch-options.txt |   31 +++++++++++++++++++++++--------
>>>  1 files changed, 23 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt
>>> index 39d326a..4cc5a80 100644
>>> --- a/Documentation/fetch-options.txt
>>> +++ b/Documentation/fetch-options.txt
>>> @@ -56,14 +56,29 @@ endif::git-pull[]
>>>  ifndef::git-pull[]
>>>  -t::
>>>  --tags::
>>> -     Most of the tags are fetched automatically as branch
>>> -     heads are downloaded, but tags that do not point at
>>> -     objects reachable from the branch heads that are being
>>> -     tracked will not be fetched by this mechanism.  This
>>> -     flag lets all tags and their associated objects be
>>> -     downloaded. The default behavior for a remote may be
>>> -     specified with the remote.<name>.tagopt setting. See
>>> -     linkgit:git-config[1].
>>> +     Most of a remote's tags are fetched automatically as branches are
>>> +     downloaded. However, git does not automatically fetch any tag that,
>>> +     when 'git fetch' completes, would not be reachable from any local
>>> +     branch head.  This option tells git to fetch all tags (and their
>>> +     associated objects).
>>
>> I would suggest clarifying the beginning of "git fetch --help" like the
>> attached patch. With that knowledge at hand, the readers do not need the
>> fuzzy "Most of ... are fetched" (leaving them wondering "what about the
>> rest, and how that Most is determined?"); we only need to say something
>> like "fetch all the tags from the remote and store them locally".
>>
>>  Documentation/git-fetch.txt |   21 ++++++++++-----------
>>  1 files changed, 10 insertions(+), 11 deletions(-)
>>
>> diff --git a/Documentation/git-fetch.txt b/Documentation/git-fetch.txt
>> index 60ac8d2..c6c7236 100644
>> --- a/Documentation/git-fetch.txt
>> +++ b/Documentation/git-fetch.txt
>> @@ -19,20 +19,19 @@ SYNOPSIS
>>
>>  DESCRIPTION
>>  -----------
>> -Fetches named heads or tags from one or more other repositories,
>> -along with the objects necessary to complete them.
>> +Fetches branches and tags (collectively known as 'refs') from one or more
>> +other repositories, along with the objects necessary to complete them.
>> +Which refs are fetched are determined by the <refspec> arguments, if
>> +given. Otherwise the default <refspec> configured for the <repository>
>> +are used (see "REMOTES" section below for how <refspec> works).
>>
>> -The ref names and their object names of fetched refs are stored
>> -in `.git/FETCH_HEAD`.  This information is left for a later merge
>> -operation done by 'git merge'.
>> +The ref names and their object names are also stored in `.git/FETCH_HEAD`.
>> +This information is used by 'git pull' that invokes this command.
>>
>>  When <refspec> stores the fetched result in remote-tracking branches,
>> -the tags that point at these branches are automatically
>> -followed.  This is done by first fetching from the remote using
>> -the given <refspec>s, and if the repository has objects that are
>> -pointed by remote tags that it does not yet have, then fetch
>> -those missing tags.  If the other end has tags that point at
>> -branches you are not interested in, you will not get them.
>> +the tags that point at commits on these branches are also fetched. Tags
>> +at the remote that point at commits that are not on these remote-tracking
>> +branches are not fetched by this mechanism (use `--tags` option to fetch them).
>>
>>  'git fetch' can fetch from either a single named repository,
>>  or from several repositories at once if <group> is given and
>>
>
> The only problem is that none of what you say seems to be true.
>
>  * The glossary very distinctly differentiates the
>    term `branch' from `branch head'.
>
>  * From skimming the code, it would seem that remote-tracking
>    branch [*heads*] are not at all the determining factor for
>    whether tags are automatically fetched. Rather, the
>    determining factor is much more relaxed: Tags are fetched
>    when a refspec's <dst> field is non-empty; it just so
>    happens that a <dst> is usually non-empty because at least
>    one remote-tracking branch [*head*] is being updated, but
>    keep in mind that the branch being updated need not be
>    considered a remote-tracking branch.

DAMNIT

That last bit should use the world `head':

  but keep in mind that the branch *head* being updated
  need not be considered a remote-tracking branch [*head*].

^ permalink raw reply

* Re: [PATCH v3] Docs: Clarify the --tags option of `git fetch'
From: Michael Witten @ 2011-09-22 17:35 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Anatol Pomozov, Drew Northup, Andrew Ardill, Daniel Johnson, git
In-Reply-To: <7v1uv8zen5.fsf@alter.siamese.dyndns.org>

On Thu, Sep 22, 2011 at 17:10, Junio C Hamano <gitster@pobox.com> wrote:
> Michael Witten <mfwitten@gmail.com> writes:
>
>> 8<-----------8<-----------8<-----------8<-----------8<-----------8<-----------
>>
>> See the discussion starting here:
>>
>>   [PATCH] Clarify that '--tags' fetches tags only
>>   Message-ID: <1314997486-29996-1-git-send-email-anatol.pomozov@gmail.com>
>>   http://thread.gmane.org/gmane.comp.version-control.git/180636
>
> It is a good practice to point to earlier discussions while polishing
> patch, and it also is good to include pointers in the commit log message
> as a supporting material (additional reading), but that is _NOT_ a
> substitute for a properly written commit log message. You need to state
> what problem you are trying to fix and how the proposed patch fixes it.
>
>>  Documentation/fetch-options.txt |   31 +++++++++++++++++++++++--------
>>  1 files changed, 23 insertions(+), 8 deletions(-)
>>
>> diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt
>> index 39d326a..4cc5a80 100644
>> --- a/Documentation/fetch-options.txt
>> +++ b/Documentation/fetch-options.txt
>> @@ -56,14 +56,29 @@ endif::git-pull[]
>>  ifndef::git-pull[]
>>  -t::
>>  --tags::
>> -     Most of the tags are fetched automatically as branch
>> -     heads are downloaded, but tags that do not point at
>> -     objects reachable from the branch heads that are being
>> -     tracked will not be fetched by this mechanism.  This
>> -     flag lets all tags and their associated objects be
>> -     downloaded. The default behavior for a remote may be
>> -     specified with the remote.<name>.tagopt setting. See
>> -     linkgit:git-config[1].
>> +     Most of a remote's tags are fetched automatically as branches are
>> +     downloaded. However, git does not automatically fetch any tag that,
>> +     when 'git fetch' completes, would not be reachable from any local
>> +     branch head.  This option tells git to fetch all tags (and their
>> +     associated objects).
>
> I would suggest clarifying the beginning of "git fetch --help" like the
> attached patch. With that knowledge at hand, the readers do not need the
> fuzzy "Most of ... are fetched" (leaving them wondering "what about the
> rest, and how that Most is determined?"); we only need to say something
> like "fetch all the tags from the remote and store them locally".
>
>  Documentation/git-fetch.txt |   21 ++++++++++-----------
>  1 files changed, 10 insertions(+), 11 deletions(-)
>
> diff --git a/Documentation/git-fetch.txt b/Documentation/git-fetch.txt
> index 60ac8d2..c6c7236 100644
> --- a/Documentation/git-fetch.txt
> +++ b/Documentation/git-fetch.txt
> @@ -19,20 +19,19 @@ SYNOPSIS
>
>  DESCRIPTION
>  -----------
> -Fetches named heads or tags from one or more other repositories,
> -along with the objects necessary to complete them.
> +Fetches branches and tags (collectively known as 'refs') from one or more
> +other repositories, along with the objects necessary to complete them.
> +Which refs are fetched are determined by the <refspec> arguments, if
> +given. Otherwise the default <refspec> configured for the <repository>
> +are used (see "REMOTES" section below for how <refspec> works).
>
> -The ref names and their object names of fetched refs are stored
> -in `.git/FETCH_HEAD`.  This information is left for a later merge
> -operation done by 'git merge'.
> +The ref names and their object names are also stored in `.git/FETCH_HEAD`.
> +This information is used by 'git pull' that invokes this command.
>
>  When <refspec> stores the fetched result in remote-tracking branches,
> -the tags that point at these branches are automatically
> -followed.  This is done by first fetching from the remote using
> -the given <refspec>s, and if the repository has objects that are
> -pointed by remote tags that it does not yet have, then fetch
> -those missing tags.  If the other end has tags that point at
> -branches you are not interested in, you will not get them.
> +the tags that point at commits on these branches are also fetched. Tags
> +at the remote that point at commits that are not on these remote-tracking
> +branches are not fetched by this mechanism (use `--tags` option to fetch them).
>
>  'git fetch' can fetch from either a single named repository,
>  or from several repositories at once if <group> is given and
>

The only problem is that none of what you say seems to be true.

  * The glossary very distinctly differentiates the
    term `branch' from `branch head'.

  * From skimming the code, it would seem that remote-tracking
    branch [*heads*] are not at all the determining factor for
    whether tags are automatically fetched. Rather, the
    determining factor is much more relaxed: Tags are fetched
    when a refspec's <dst> field is non-empty; it just so
    happens that a <dst> is usually non-empty because at least
    one remote-tracking branch [*head*] is being updated, but
    keep in mind that the branch being updated need not be
    considered a remote-tracking branch.

^ permalink raw reply

* any way to "re-sync" a bare repository against another bare repository?
From: Chris Friesen @ 2011-09-22 17:22 UTC (permalink / raw)
  To: git

Suppose I have a parent bare repository.  I do a "git clone --bare" to 
create a child repository, and then clone the child to create a 
grandchild repository.

If changes get pushed into the parent repository, is there any way to 
cause the child to be updated?

Just a "git fetch <parent>" doesn't seem to help.  If I set up parent as 
a remote branch I can fetch it, but then it shows all the branches as 
"parent/<branch>" rather than updating the child.

I just tried a "git clone --mirror" to create the child and it seems to 
allow me to pick up changes in the parent via "git fetch".  Is that the 
proper way to handle this?

Thanks,
Chris

-- 
Chris Friesen
Software Developer
GENBAND
chris.friesen@genband.com
www.genband.com

^ permalink raw reply

* Re: Bug: git log --numstat counts wrong
From: Junio C Hamano @ 2011-09-22 17:32 UTC (permalink / raw)
  To: Alexander Pepper; +Cc: git, Tay Ray Chuan
In-Reply-To: <FAB0B05E-6BAD-488C-8478-F4B80493FB96@inf.fu-berlin.de>

Alexander Pepper <pepper@inf.fu-berlin.de> writes:

> When used git version 1.7.7.rc1 I didn't observed any case where git
> show and git log --numstat mismatch. I'm only a little confused, that
> 'git show' yields different results, depending on the git version.

In general it is not surprising nor unexpected--as long as both patches
describe the change correctly, they are both valid.

What was unexpected to me was that 27af01d (xdiff/xprepare: improve O(n*m)
performance in xdl_cleanup_records(), 2011-08-17) which was supposed to be
only about performance and not about logic made that difference.

^ permalink raw reply

* Re: How to use git attributes to configure server-side checks?
From: Junio C Hamano @ 2011-09-22 17:26 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git discussion list, Jay Soffian
In-Reply-To: <4E7AF1AE.5030005@alum.mit.edu>

Michael Haggerty <mhagger@alum.mit.edu> writes:

> I would like the checking configuration to be *versioned* along with the
> code.  For example, suppose my project decides to enforce a rule that
> all Python code needs to be indented with spaces.  It might be that not
> all of our old code adheres to this rule, and that we only want to clean
> up the code in master.

You want to sneak in a badly formatted code? Add an entry to the in-tree
attributes file to disable whitespace checking to cover that file!

So even though I agree with you that the check mechanism may need to be
aware of what revision it is checking and adjust which rules are applied
when checking the revision, I do not think using in-tree attribute file is
the right solution to that problem.

^ permalink raw reply

* Re: [PATCH] patch-id.c: use strbuf instead of a fixed buffer
From: Jeff King @ 2011-09-22 17:17 UTC (permalink / raw)
  To: Michael Schubert; +Cc: Junio C Hamano, Andrew Pimlott, git
In-Reply-To: <4E79DBAE.5090505@elegosoft.com>

On Wed, Sep 21, 2011 at 02:42:22PM +0200, Michael Schubert wrote:

> get_one_patchid() uses a rather dumb heuristic to determine if the
> passed buffer is part of the next commit. Whenever the first 40 bytes
> are a valid hexadecimal sha1 representation, get_one_patchid() returns
> next_sha1.
> Once the current line is longer than the fixed buffer, this will break
> (provided the additional bytes make a valid hexadecimal sha1). As a result
> patch-id returns incorrect results. Instead, user strbuf and read one
> line at a time.

A minor nit, but I think this is probably broken even if the additional
bytes don't look like a valid sha1. It would look like cruft after the
diff (since the lien doesn't start with plus, minus, or space), which
means it would not be stirred into the sha1 mix.

>  builtin/patch-id.c |   10 ++++++----
>  1 files changed, 6 insertions(+), 4 deletions(-)

The patch itself looks good to me, though. Thanks.

-Peff

^ permalink raw reply

* Re: How to use git attributes to configure server-side checks?
From: Jeff King @ 2011-09-22 17:13 UTC (permalink / raw)
  To: Jay Soffian; +Cc: Michael Haggerty, Junio C Hamano, git discussion list
In-Reply-To: <CAG+J_DxtCx6-RKWLKFy+V7tOtu7UnUrke7iN8gNdGiY-sC52sQ@mail.gmail.com>

On Thu, Sep 22, 2011 at 11:41:34AM -0400, Jay Soffian wrote:

> Thank you for this thread. I was under the illusion that diff-tree
> --check considered in-tree .gitattributes, but the code seems to
> indicate otherwise. :-(

No, we definitely don't use in-tree gitattributes. IIRC, there are some
precedence and ordering questions. I think the ordering now is:

  1. Look in $GIT_DIR/info/attributes

  2. If not found, look in per-directory .gitattributes

  3. If not found, look in core.gitattributesfile

Where do in-tree attributes fit in? Between 1 and 2? Or 2 and 3? And
which tree do we look at?

Here are some examples:

  a. If I do "git checkout branch-foo", we should look at branch-foo's
     tree for things like crlf, right?  Do we still fall back to
     per-directory .gitattributes in the working tree? On the one hand,
     they're not really relevant to the commit we're moving to. But they
     are respected in the current code, and can be useful when moving to
     old commits which lack attributes.

     I think this is where the index magic comes in in the current code
     (we do something like "load the index, then respect gitattributes
     from the index").

     So maybe this is solved already.

  b. You're diffing commit $a against commit $b. Whose gitattributes
     have precedence? Is order important? Are gitattributes in the
     working tree and index relevant?

  c. You're diffing commit $a against HEAD. Which gitattributes have
     precedence? Again, is order important (i.e., does "diff -R" look at
     different attributes than "diff")?

I'm sure there are others, too. And I don't think any of these is
insurmountable. But somebody needs to think through a lot of cases and
come up with consistent, sane behavior that does the right thing in each
case (considering both bare repositories and ones with working trees).

-Peff

^ permalink raw reply

* Re: [PATCH v3] Docs: Clarify the --tags option of `git fetch'
From: Junio C Hamano @ 2011-09-22 17:10 UTC (permalink / raw)
  To: Michael Witten
  Cc: Anatol Pomozov, Drew Northup, Andrew Ardill, Daniel Johnson, git
In-Reply-To: <686c38876d5a4ad6bfac67ca77fe9bb3-mfwitten@gmail.com>

Michael Witten <mfwitten@gmail.com> writes:

> 8<-----------8<-----------8<-----------8<-----------8<-----------8<-----------
>
> See the discussion starting here:
>
>   [PATCH] Clarify that '--tags' fetches tags only
>   Message-ID: <1314997486-29996-1-git-send-email-anatol.pomozov@gmail.com>
>   http://thread.gmane.org/gmane.comp.version-control.git/180636

It is a good practice to point to earlier discussions while polishing
patch, and it also is good to include pointers in the commit log message
as a supporting material (additional reading), but that is _NOT_ a
substitute for a properly written commit log message. You need to state
what problem you are trying to fix and how the proposed patch fixes it.

>  Documentation/fetch-options.txt |   31 +++++++++++++++++++++++--------
>  1 files changed, 23 insertions(+), 8 deletions(-)
>
> diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt
> index 39d326a..4cc5a80 100644
> --- a/Documentation/fetch-options.txt
> +++ b/Documentation/fetch-options.txt
> @@ -56,14 +56,29 @@ endif::git-pull[]
>  ifndef::git-pull[]
>  -t::
>  --tags::
> -	Most of the tags are fetched automatically as branch
> -	heads are downloaded, but tags that do not point at
> -	objects reachable from the branch heads that are being
> -	tracked will not be fetched by this mechanism.  This
> -	flag lets all tags and their associated objects be
> -	downloaded. The default behavior for a remote may be
> -	specified with the remote.<name>.tagopt setting. See
> -	linkgit:git-config[1].
> +	Most of a remote's tags are fetched automatically as branches are
> +	downloaded. However, git does not automatically fetch any tag that,
> +	when 'git fetch' completes, would not be reachable from any local
> +	branch head.  This option tells git to fetch all tags (and their
> +	associated objects).

I would suggest clarifying the beginning of "git fetch --help" like the
attached patch. With that knowledge at hand, the readers do not need the
fuzzy "Most of ... are fetched" (leaving them wondering "what about the
rest, and how that Most is determined?"); we only need to say something
like "fetch all the tags from the remote and store them locally".

 Documentation/git-fetch.txt |   21 ++++++++++-----------
 1 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/Documentation/git-fetch.txt b/Documentation/git-fetch.txt
index 60ac8d2..c6c7236 100644
--- a/Documentation/git-fetch.txt
+++ b/Documentation/git-fetch.txt
@@ -19,20 +19,19 @@ SYNOPSIS
 
 DESCRIPTION
 -----------
-Fetches named heads or tags from one or more other repositories,
-along with the objects necessary to complete them.
+Fetches branches and tags (collectively known as 'refs') from one or more
+other repositories, along with the objects necessary to complete them.
+Which refs are fetched are determined by the <refspec> arguments, if
+given. Otherwise the default <refspec> configured for the <repository>
+are used (see "REMOTES" section below for how <refspec> works).
 
-The ref names and their object names of fetched refs are stored
-in `.git/FETCH_HEAD`.  This information is left for a later merge
-operation done by 'git merge'.
+The ref names and their object names are also stored in `.git/FETCH_HEAD`.
+This information is used by 'git pull' that invokes this command.
 
 When <refspec> stores the fetched result in remote-tracking branches,
-the tags that point at these branches are automatically
-followed.  This is done by first fetching from the remote using
-the given <refspec>s, and if the repository has objects that are
-pointed by remote tags that it does not yet have, then fetch
-those missing tags.  If the other end has tags that point at
-branches you are not interested in, you will not get them.
+the tags that point at commits on these branches are also fetched. Tags
+at the remote that point at commits that are not on these remote-tracking 
+branches are not fetched by this mechanism (use `--tags` option to fetch them).
 
 'git fetch' can fetch from either a single named repository,
 or from several repositories at once if <group> is given and

^ permalink raw reply related

* Re: What's cooking in git.git (Sep 2011, #06; Wed, 21)
From: Junio C Hamano @ 2011-09-22 16:39 UTC (permalink / raw)
  To: Michael Schubert; +Cc: Carlos Martín Nieto, git
In-Reply-To: <4E7AFC6C.7080603@elegosoft.com>

Michael Schubert <mschub@elegosoft.com> writes:

> On 09/22/2011 10:37 AM, Carlos Martín Nieto wrote:
>> On Wed, 2011-09-21 at 22:04 -0700, Junio C Hamano wrote:
>>> * cn/eradicate-working-copy (2011-09-21) 2 commits
>>>  - patch-id.c: use strbuf instead of a fixed buffer
>>>  - Remove 'working copy' from the documentation and C code
>> 
>> It looks like that first commit sneaked in there. Shouldn't that be its
>> own topic?
>
> It's in pu twice:

Thansk for noticing. Will remove.

> On 09/22/2011 07:04 AM, Junio C Hamano wrote: 
>> * cn/eradicate-working-copy (2011-09-21) 2 commits
>>  - patch-id.c: use strbuf instead of a fixed buffer
>>  - Remove 'working copy' from the documentation and C code
>
>> * ms/patch-id-with-overlong-line (2011-09-21) 1 commit
>>  - patch-id.c: use strbuf instead of a fixed buffer
>
> 64128da and a6c5c60
>
> There's also a minor typo in the last sentence of the commit message.
> Should I resend?

If that is s/user/use/ and nothing else, I can amend locally; otherwise
please do.

The last part of the "rather dumb heiristic" you talk about is designed to
parse "cmd | diff-tree -p --stdin" output. I agree with you that it can
safely tightened a bit by checking that 41st byte is the EOL, but it
probably is not worth it.

^ permalink raw reply

* Re: Bug: git log --numstat counts wrong
From: René Scharfe @ 2011-09-22 16:15 UTC (permalink / raw)
  To: git; +Cc: git
In-Reply-To: <7vobyd1vmo.fsf@alter.siamese.dyndns.org>

Am 21.09.2011 22:35, schrieb Junio C Hamano:
> Alexander Pepper <pepper@inf.fu-berlin.de> writes:
> 
>> Am 21.09.2011 um 14:24 schrieb Junio C Hamano:
>>>> $ git log --numstat 48a07e7e533f507228e8d1c99d4d48e175e14260
>>>> [...]
>>>> 11      10      src/java/voldemort/server/storage/StorageService.java
>>>
>>> Didn't we update it this already? I seem to get 10/9 here not 11/10.
>>
>> Current 'maint' (cd2b8ae9), 'master' (4b5eac7f)...
> 
> That's a tad old master you seem to have.
> 
> Strangely, bisection points at 27af01d5523, which was supposed to be only
> about performance and never about correctness. There is something fishy
> going on....

The patch below reverts a part of 27af01d5523 that's not explained in its
commit message and doesn't seem to contribute to the intended speedup.  It
seems to restore the original diff output.  I don't know how it's actually
doing that, though, as I haven't dug into the code at all.

Alexander, can you confirm that this patch restores the old behaviour of
git diff and git show for your test cases?

Ray, are you able to write a commit message for this patch if it turns out
to be useful?

René


diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index 5a33d1a..e419f4f 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -383,7 +383,7 @@ static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
  * might be potentially discarded if they happear in a run of discardable.
  */
 static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xdf2) {
-	long i, nm, nreff;
+	long i, nm, nreff, mlim;
 	xrecord_t **recs;
 	xdlclass_t *rcrec;
 	char *dis, *dis1, *dis2;
@@ -396,16 +396,20 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd
 	dis1 = dis;
 	dis2 = dis1 + xdf1->nrec + 1;
 
+	if ((mlim = xdl_bogosqrt(xdf1->nrec)) > XDL_MAX_EQLIMIT)
+		mlim = XDL_MAX_EQLIMIT;
 	for (i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart]; i <= xdf1->dend; i++, recs++) {
 		rcrec = cf->rcrecs[(*recs)->ha];
 		nm = rcrec ? rcrec->len2 : 0;
-		dis1[i] = (nm == 0) ? 0: 1;
+		dis1[i] = (nm == 0) ? 0: (nm >= mlim) ? 2: 1;
 	}
 
+	if ((mlim = xdl_bogosqrt(xdf2->nrec)) > XDL_MAX_EQLIMIT)
+		mlim = XDL_MAX_EQLIMIT;
 	for (i = xdf2->dstart, recs = &xdf2->recs[xdf2->dstart]; i <= xdf2->dend; i++, recs++) {
 		rcrec = cf->rcrecs[(*recs)->ha];
 		nm = rcrec ? rcrec->len1 : 0;
-		dis2[i] = (nm == 0) ? 0: 1;
+		dis2[i] = (nm == 0) ? 0: (nm >= mlim) ? 2: 1;
 	}
 
 	for (nreff = 0, i = xdf1->dstart, recs = &xdf1->recs[xdf1->dstart];

^ permalink raw reply related

* Re: How to use git attributes to configure server-side checks?
From: Jay Soffian @ 2011-09-22 15:41 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Junio C Hamano, git discussion list
In-Reply-To: <4E7AF1AE.5030005@alum.mit.edu>

On Thu, Sep 22, 2011 at 4:28 AM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
> Where does "git show --check" read its gitattributes (i.e.,
> "whitespace") from?

It looks like builtin_checkdiff (diff.c) calls whitespace_rule (ws.c)
which in turn calls git_check_attr (attr.c) which, in a bare repo,
considers $(prefix)/etc/gitattributes, core.attributesfile and
$GIT_DIR/info/attributes, falling back to the default whitespace rule
of trailing_space, space_before_tab, and tab-width of 8
(WS_DEFAULT_RULE in cache.h).

Thank you for this thread. I was under the illusion that diff-tree
--check considered in-tree .gitattributes, but the code seems to
indicate otherwise. :-(

j.

^ permalink raw reply

* Re: [PATCH/RFCv4 2/4] gitweb: Add manpage for /etc/gitweb.conf (for gitweb documentation)
From: Jakub Narebski @ 2011-09-22 13:41 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, Drew Northup
In-Reply-To: <20110920044118.GI6343@elie>

On Tue, 20 Sep 2011, Jonathan Nieder wrote:
> Jakub Narebski wrote:
> 
> > [jn: Improved, extended, removed duplicate info from README]
> 
> Hoorah!  I have very little to add to what's already been done ---
> mostly I can just lend my ear as a native English speaker.

Thanks, that is a lot of help; English is not my primary language.

> > +The syntax of the configuration files is that of Perl, as these files are
> > +indeed handled as fragments of Perl code (the language that gitweb itself is
> > +written in). Variables may be set using "`our $variable = value`"; text from
> > +"#" character until the end of a line is ignored. See the *perlsyn*(1) man
> > +page for more information.
> > +
> > +Actually using `our` qualifier in `our $variable = <value>;` is a safety
> > +check: if newer gitweb does no longer use given variable, by using `our` we
> > +won't get syntax errors.
> 
> Language nit: for "indeed", something like "internally" might be more
> natural.
> 
> The syntax and the typical idiom were already described in the
> DESCRIPTION section, so the emphasis should be somehow different here
> to avoid boring the reader.  One way might be to plunge directly into
> what the second paragraph above says:
> 
>  The syntax of the configuration files is that of Perl, since these files are
>  handled by sourcing them as fragments of Perl code (the language that
>  gitweb itself is written in). Variables are typically set using the
>  `our` qualifier (as in "`our $variable = <value>;`") to avoid syntax
>  errors if a new version of gitweb no longer uses a variable and therefore
>  stops declaring it.

Thanks, I used your proposal.  It reduces duplication, and reads better.
 
> > +
> > +You can include other configuration file using read_config_file()
> > +subroutine.  For example, you can read defaults in fallback
> > +system-wide GITWEB_CONFIG_SYSTEM from GITWEB_CONFIG by adding
> > +
> > +  read_config_file($GITWEB_CONFIG_SYSTEM);
> > +
> > +at very beginning of per-instance GITWEB_CONFIG file.  In this case
> > +settings in said per-instance file will override settings from
> > +system-wide configuration file.  Note that read_config_file checks
> > +itself that the file it reads exists.
[...]

> The reader wonders: After checking that the file exists, what does
> read_config_file do (does it silently skip it or error out)?  And why
> would I use this trick instead of just writing a gitweb-common.conf
> file?  (Was read_config_file introduced before gitweb-common.conf,
> making this useful when creating a configuration that should be usable
> with older versions of gitweb?)

I have replaced it now with the following:

  You can include other configuration file using read_config_file()
  subroutine.  For example, one might want to put gitweb configuration
  related to access control for viewing repositories via Gitolite (one
  of git repository management tools) in a separate file, e.g. in
  '/etc/gitweb-gitolite.conf'.  To include it, put
  
  --------------------------------------------------
  read_config_file("/etc/gitweb-gitolite.conf");
  --------------------------------------------------
  
  somewhere in gitweb configuration file used, e.g. in per-installation
  gitweb configuration file.  Note that read_config_file() checks itself
  that the file it reads exists, and does nothing if it is not found.
  It also handles errors in included file.

> > +$project_maxdepth::
> > +	Filesystem traversing depth limit for recursively scanning for git
> > +	repositories, used if `$projects_list` (below) is unset.  The default
> > +	value of this variable is determined by build-time configuration
> > +	variable `GITWEB_PROJECT_MAXDEPTH`, which defaults to 2007.
> 
> "Filesystem traversing depth limit" is quite a long stack of nouns :),
> and unfortunately I don't know what it means.  Is this the depth of
> nested subdirectories below $project_root that gitweb will look at
> when discovering repositories?  If I have no subdirectories in the
> projectroot except the repositories themselves, should I set this to 1
> or 0?  What happens with forks?  Is this just a convenience feature or
> can it be used for security or to create performance gaurantees?
> 
> By the way, what happens if projectroot contains a symlink to some
> directory elsewhere in the filesystem containing repositories --- will
> gitweb traverse it?

I have changed order of describing variables, putting $projects_list
before $project_maxdepth - this way we don't have to refer to following
text.

I have replaced the description of $project_maxdepth with the following

  $project_maxdepth::
  	If `$projects_list` variable is unset, gitweb will recursively
  	scan filesystem for git repositories.  The `$project_maxdepth`
  	is used to limit traversing depth, relative to `$projectroot`
  	(starting point); it means that directories which are further
  	from `$projectroot` than `$project_maxdepth` will be skipped.
  +
  It is purely performance optimization, originally intended for MacOS X,
  where recursive directory traversal is slow.  Gitweb follows symbolic
  links, but it detects cycles, ignoring any duplicate files and directories.
  +
  The default value of this variable is determined by the build-time
  configuration variable `GITWEB_PROJECT_MAXDEPTH`, which defaults to
  2007.
 
I hope that it makes meaning and use of this variable more clear.

> What if I want this to be infinite (i.e., to disable the feature) ---
> would I be crazy?

It is currently not possible to disable this feature.  The value of 2007
should be enough in most cases (read: all sane cases).
 
> > +$projects_list::
> > +	Plain text file listing projects or name of directory
> > +	to be scanned for projects.
[...]

> When this is a relative path, what is it taken relative to?

As with all relative paths to files, it is relative to where CGI script
is.  I probably should write about this somewhere...

> > +
> > +$export_ok::
> > +	Show repository only if this file exists (in repository).  Only
> > +	effective if this variable evaluates to true.  Can be set during
> > +	building gitweb via `GITWEB_EXPORT_OK`.  [No default / Not set]
> 
> This is always a relative path, right?  What is it relative to ---
> $GIT_DIR, I guess?
> 
> Usage nits: the phrase starting with "during" reads more naturally if
> "during" is replaced with "when" and "via" replaced with "by setting".
> If there were no default, that would mean that gitweb errors out when
> gitweb.conf does not set this variable; instead, the default seems to
> be "undef" (or 'false') if I understand correctly.

Description of $export_ok got replaced by the following:

  $export_ok::
  	Show repository only if this file exists (in repository).  Only
  	effective if this variable evaluates to true.  Can be set when
  	building gitweb by setting `GITWEB_EXPORT_OK`.  This path is
  	relative to `GIT_DIR`.  git-daemon[1] uses 'git-daemon-export-ok',
  	unless started with `--export-all`.  [No default / Not set
  	(this feature is turned off)]

> > +
> > +$export_auth_hook::
> > +	Show repository only if this subroutine returns true, when given as
> > +	the only parameter the full path to the project.  Example:
> > ++
> > +----------------------------------------------------------------------------
> > +our $export_auth_hook = sub { return -e "$_[0]/git-daemon-export-ok"; };
> > +----------------------------------------------------------------------------
> > ++
> > +though the above might be done by using `$export_ok` instead
> > ++
> > +----------------------------------------------------------------------------
> > +our $export_ok = "git-daemon-export-ok";
> > +----------------------------------------------------------------------------
> 
> Style nit: commands in this kind of documentation should be directed to
> the reader, not gitweb, so it would be nicer to explain this in terms of
> what the $export_auth_hook is.

Thanks.  I should probably rewrite description of $export_ok in the same
way, isn't it?

>                                 That is, something like the following: 
> 
>  $export_auth_hook::
> 	Function used to determine which repositories should be shown.
> 	This subroutine should take one parameter, the full path to
> 	a project, and if it returns true, that project will be included
> 	in the projects list and can be accessed through gitweb as long
> 	as it fulfills the other requirements described by $export_ok,
> 	$projects_list, and $projects_maxdepth.  Example:

Thanks, I have used your description.
 
> Is "our $export_auth_hook = undef;" a valid configuration?  What is the
> default?

I have added the following line at the end of description of this
variable:

  If not set (default), it means that this feature is disabled.

> > +
> > +$strict_export::
> > +	Only allow viewing of repositories also shown on the overview page.
> > +	This for example makes `$gitweb_export_ok` file decide if repository is
> > +	available and not only if it is shown.  If `$gitweb_list` points to
> > +	file with list of project, only those repositories listed would be
> > +	available for gitweb.  Can be set during building gitweb via
> > +	`GITWEB_STRICT_EXPORT`.  [No default / Not set]
> 
> Is the default behavior as though this were true or false?

Last sentence now reads:

  [No default / Not set (you can access	repositories hidden from projects
  list page)]

> > +Finding files
> > +~~~~~~~~~~~~~
> > +Those configuration variables tell gitweb where to find files.  Values of
> > +those variables are paths on filesystem.  Of those only `$GIT` is required
> > +to be (correctly) set for gitweb to be able to work; all the rest are
> > +required only for extra configuration or extra features.
[...]
> When you say the remainder are only required in special cases, does
> that mean that they are ignored unless some other enabling variable is
> set or that they can be set to "undef" to disable the relevant
> feature?

I have removed this sequence, and extended description of each variable
instead.
 
> > +$mimetypes_file::
> > +	File to use for (filename extension based) guessing of MIME types before
> > +	trying '/etc/mime.types'.  Path, if relative, is taken currently as
> > +	relative to the current git repository.  [Unset by default]

> The use of "currently" sounds like an apology.  Does that mean that
> gitweb ought to be rejecting relative paths for this variable?

I think it was originally intended as note that this resolution of relative
path is subject to change in future versions of gitweb.  But I think it
is not true (backward compatibility), and it is more important that this
resolution of relative path is untypical.

I have reformulated this description to read:

  $mimetypes_file::
  	File to use for (filename extension based) guessing of MIME types before
  	trying '/etc/mime.types'.  *NOTE* that this path, if relative, is taken
  	as relative to the current git repository, not to CGI script.  If unset,
  	only '/etc/mime.types' is used (if present on filesystem).  If no mimetypes
  	file is found, mimetype guessing bassed on extension of file is disabled.
  	[Unset by default]
 
> > +
> > +$highlight_bin::
> > +	Path to the highlight executable to use (must be the one from
> > +	http://www.andre-simon.de due to assumptions about parameters and output).
> > +	Useful if 'highlight' is not installed on your webserver's PATH.
> > +        [Default: 'highlight']
> > ++
> > +*NOTE*: if you want to add support for new syntax (supported by
> > +"highlight" but not used by gitweb), you need to modify `%highlight_ext`
> > +or `%highlight_basename`, depending on whether you detect type of file
> > +based on extension (for example "*.sh") or on its basename (for example
> > +"Makefile").  Keys of those hashes are extension or basename,
> > +respectively, and value for given key is name of syntax to be passed via
> > +`--syntax <syntax>` to highlighter.
[...]

> Is "*.sh" actually an example of an extension?  I.e., do I write the
> following?
> 
> 	our %highlight_ext;
> 	$highlight_ext{"*.sh"} = "sh";

You are right, it should be "sh", not "*.sh".  Fixed, and added example
(about "phtml", which usually means PHP, but might mean embedded Perl
(ePerl)).

> > +Links and their targets
> > +~~~~~~~~~~~~~~~~~~~~~~~
> > +Configuration variables described below configure some of gitweb links:
> > +their target and their look (text or image), and where to find page
> > +prerequisites (stylesheet, favicon, images, scripts).  Usually they are left
> > +at their default values, with the possible exception of `@stylesheets`
> > +variable.
> 
> Language nits: missing "The" before "Configuration variables described
> below"; missing "some" before "supporting files".
                                ^^^^^^^^^^^^^^^^^^---???  
 
> > +
> > +@stylesheets::
> > +	List of URIs of stylesheets (relative to base URI of a page). You
> > +	might specify more than one stylesheet, for example use gitweb.css
> > +	as base, with site specific modifications in separate stylesheet
> > +	to make it easier to upgrade gitweb. You can add a `site` stylesheet
> > +	for example by putting
> > ++
> > +----------------------------------------------------------------------------
> > +push @stylesheets, "gitweb-site.css";
> > +----------------------------------------------------------------------------
> > ++
> > +in the gitweb config file.  Those values that are relative paths, are
> > +relative to base URI of gitweb.
> 
> [...]  What is a base URI --- is it set by another variable?

Base URI is a concept defined in HTML / XHTML specification (or in URI
definition), so I'd rather not explain it here.


> > +This list should contain URI of gitweb's stylesheet.  The URI of gitweb
> > +stylesheet is set during build time via `GITWEB_CSS` variable.  It's default
> > +value is 'static/gitweb.css' (or 'static/gitweb.min.css' if the `CSSMIN`
> > +variable is defined, i.e. if CSS minifier is used during build).
> > ++
> > +*Note*: there is also legacy `$stylesheet` configuration variable, which was
> > +used by older gitweb.
[...]
> If I am new on the job and find the $stylesheet variable set, what
> should I interpret it to mean?  How can it be translated to the newer
> @stylesheets style?  What happens if both variables are set --- does
> one override the other, or are they combined somehow?

Added

  If `$stylesheet` variable is defined, only CSS stylesheet
  given by this variable is used by gitweb.

I hope that old config files that define this variable are rare.

> > +$home_link::
> > +	Target of the home link on top of all pages (the first part of view
> > +	"breadcrumbs").  By default set to absolute URI of a page ($my_uri).

> Which page does "a page" refer to here (doesn't $my_uri change from
> request to request)?

I have extended last sequence of this description to read:

  By default it is set to the absolute URI of a current page
  (to the value of `$my_uri` variable, or to "/" if `$my_uri` is undefined
  or is an empty string).
 
$my_uri changes from request to request, and $home_link too.

By default config is re-read on each request; if not, one should
use $per_request_config subroutine to set this variable if it changes
from request to request.

> > +$default_projects_order::
> > +	Default value of ordering of projects on projects list page.  Valid
> > +	values are "none" (unsorted), "project" (by project name, i.e. path
> > +	to repository relative to `$projectroot`), "descr" (project
> > +	description), "owner", and "age" (by date of most current commit).
> > ++
> > +Default value is "project".  Unknown value means unsorted.
> 
> What does "Default" mean here (i.e., what overrides it)?

Added

  [...], which means the ordering used if you don't explicitly sort
  projects list (if there is no "o" CGI query parameter in the URL).
 
> > +
> > +
> > +Changing gitweb behavior
> > +~~~~~~~~~~~~~~~~~~~~~~~~
> > +Those configuration variables control _internal_ gitweb behavior.
> 
> [...] Does internal behavior mean "functionality as opposed to
> appearance and input location" or something like that?

Yes, it does.

> > +
> > +$default_blob_plain_mimetype::
> > +	Default mimetype for blob_plain (raw) view, if mimetype checking
> > +	doesn't result in some other type; by default "text/plain".
[...]

> Where can I look to read more about mimetype autodetection?

Added the following sentence:

  	Gitweb guesses mimetype of a file to display based on extension
  	of its filename, using `$mimetypes_file` (if set and file exists)
  	and '/etc/mime.types' files (see *mime.types*(5) manpage; only
  	filename extension rules are supported by gitweb).
 
> > +@diff_opts::
> > +	Rename detection options for git-diff and git-diff-tree. By default
> > +	(\'-M'); set it to (\'-C') or (\'-C', \'-C') to also detect copies,
> > +	or set it to () i.e. empty list if you don't want to have renames
> > +	detection.
[...]

> Probably worth mentioning that GNU patch still has problems with some
> rename patches, especially when they involve file copies ['-C'] or
> criss-cross renames ['-B'] (see [1] and [2], for example).
> 
> [1] http://savannah.gnu.org/bugs/?31058
> [2] http://thread.gmane.org/gmane.linux.ports.sh.devel/8697/focus=8773

I have added the following paragraph:

  +
  *Note* that rename and especially copy detection can be quite
  CPU-intensive.  Note also that non git tools can have problems with
  patches generated with options mentioned above, especially when they
  involve file copies (\'-C') or criss-cross renames (\'-B').


> > +@git_base_url_list::
> > +	List of git base URLs used for URL to where fetch project from, shown
> > +	in project summary page.  Full URL is "`$git_base_url/$project`".  You
> > +	can setup multiple base URLs (for example one for `git://` protocol
> > +	access, and one for `http://` "dumb" protocol access).  Note that per
> > +	repository configuration can be set in '$GIT_DIR/cloneurl' file, or as
> > +	values of multi-value `gitweb.url` configuration variable in project
> > +	config.
> > ++
> > +You can setup one single value (single entry/item in this list) during
> > +building by using `GITWEB_BASE_URL` built-time configuration variable.
> > +[No default]
> 
> Language nits: "git base URLs used for URL to where fetch project from"
> means something like "Git URLs relative to which projects can be
> fetched"; [...]

[...]
> Is the per repository cloneurl added to this list, or does it override
> it?

I have rewritten it to hopefully more clear version, and added information
about precedence of cloneurl and the like:

  @git_base_url_list::
  	List of git base URLs.  These URLs are used to generate URLs
  	describing from where to fetch a project, which are shown on
  	project summary page.  The full fetch URL is "`$git_base_url/$project`",
  	for each element of this list. You can set up multiple base URLs
  	(for example one for `git://` protocol, and one for `http://`
  	protocol).
  +
  Note that per repository configuration can be set in '$GIT_DIR/cloneurl'
  file, or as values of multi-value `gitweb.url` configuration variable in
  project config.  Per-repository configuration takes precedence over value
  composed from `@git_base_url_list` elements and project name.
  +
  You can setup one single value (single entry/item in this list) at build
  time by setting the `GITWEB_BASE_URL` built-time configuration variable.
  [Default: (), i.e. empty list]

> > +$maxload::
> > +	Used to set the maximum load that we will still respond to gitweb queries.
> > +	If server load exceed this value then return "503 Service Unavailable"
> > +	error. Server load is taken to be 0 if gitweb cannot determine its value.
> > +	Set it to undefined value to turn it off. [Default: 300]
> 
> Probably worth spelling out that this is a "load average" (as shown by uptime(1)).

I have added the following sentence to describe it:

  Currently it works only on Linux,
  where it uses '/proc/loadavg'; the load there is the number of active
  tasks on the system -- processes that are actually running -- averaged
  over the last minute.
 

> > +$per_request_config::
> > +	If set to code reference, it would be run once per each request.  You can
> > +	set parts of configuration that change per session, e.g. by adding
> > +	the following code to gitweb configuration file
> > ++
> > +--------------------------------------------------------------------------------
> > +our $per_request_config = sub {
> > +	$ENV{GL_USER} = $cgi->remote_user || "gitweb";
> > +};
> > +--------------------------------------------------------------------------------
> 
> Language nits: missing "this is" before "set to a code reference";
> "would" should be "will" (those crazy English conditionals again); the
> appropriate preposition before "each request" is "for", not "per";
> missing "this way" after "change per session"; would be clearer with a
> full stop before the example, and with the example made into a full
> sentence ("For example, one might use the following code in a gitweb
> configuration file:").
> 
> > ++
> > +Otherwise it is treated as boolean value: if true gitweb would process
> > +config file once per request, if false it would process config file only
> > +once.  [Default: true]
> > ++
> > +*NOTE*: `$my_url`, `$my_uri`, and `$base_url` are overwritten with their default
> > +values before every request, so if you want to change them, be sure to set
> > +this variable to true or a code reference effecting the desired changes.
> 
> At this point, I've forgotten what the "Otherwise" is contrasting with;
> probably best to repeat it ("If $per_request_config is not a code
> reference, it is interpreted as a boolean value."); "would" should be
> "will" again (2x); missing "the" or "each" before "config file" (2x);
> missing "and" before "if it is false"; "it" could be "gitweb" to
> avoid confusion in this pronoun-heavy sentence; missing "each time it
> is executed" after "process the config file only once".
> 
> Probably worth mentioning that this variable starts to shine when one
> gitweb instance is used to serve multiple requests, with CGI
> replacements like mod_perl, fastcgi, plackup, and so on.

I have rewritten and extended it as following:

  If `$per_request_config` is not a code reference, it is interpreted as boolean
  value.  If it is true gitweb will process config files once per request,
  and if it is false gitweb will process config files only once, each time it
  is executed.  [Default: true]
  +
  *NOTE*: `$my_url`, `$my_uri`, and `$base_url` are overwritten with their default
  values before every request, so if you want to change them, be sure to set
  this variable to true or a code reference effecting the desired changes.
  +
  This variable matters only when using persistent web environments that
  serve multiple requests using single gitweb instance, like mod_perl,
  FastCGI or Plackup.

> > +$version::
> > +	Gitweb version, set automatically when creating gitweb.cgi from
> > +	gitweb.perl. You might want to modify it if you are running modified
> > +	gitweb, for example 
> > ++
> > +---------------------------------------------------
> > +our $version .= " with caching";
> > +---------------------------------------------------
> 
> Might be worth mentioning this is only used in HTML comments and the
> "generator" metadata field, nothing more special than that.

Added:

  This variable is purely informational, used e.g. in the "generator"
  meta header in HTML header.
 
> > +
> > +$my_url::
> > +$my_uri::
> > +	Full URL and absolute URL of gitweb script;
> > +	in earlier versions of gitweb you might have need to set those
> > +	variables, now there should be no need to do it.  See
> > +	`$per_request_config` if you need to set them still.
[...]

> What is the difference between a full URL and an absolute URL?

Full URL is e.g. 'http://git.example.com/gitweb.cgi', while absolute URL
is e.g. '/gitweb.cgi'.

> > +sub::
> > +	Subroutine that will be called with default options as parameters to
> > +	examine per-repository configuration, but only if feature is
> > +	overridable (\'override' is set to true).  If not present then
> > +	per-repository override for given feature is not supported.
> > ++
> > +You wouldn't need to ever change it in gitweb config file.
> 
> Language nits: "default options as parameters" means "the array
> referred to by the 'default' field as parameter list", I guess.
> Missing "the" before "feature is overridable" and "gitweb
> configuration files".  "given feature" -> "this feature".  "not
> supported" -> "not enabled", maybe.
> 
> Actually, I'm not sure I understand this one.  Are users supposed to
> set this field?  Where can they look to find out what features have it
> set by default?  Who calls this function, and what does that person or
> code path do with the return value?  If it is not part of the gitweb
> configuration that an administrator might modify (the subject of this
> manpage), why not just say that ("subroutine used internally; present
> if and only if this feature can be made overridable with the
> "override" field; you should not change it") and leave it at that?

Changed this to:

  sub::
  	Internal detail of implementation.  What is important is that
  	if this field is not present then per-repository override for
  	given feature is not supported.
  +
  You wouldn't need to ever change it in gitweb config file.
 
> > +
> > +
> > +Features in `%feature`
> > +~~~~~~~~~~~~~~~~~~~~~~
> > +Below there are described gitweb features that are configurable via
> > +`%feature` hash.  For a complete list please consult gitweb sources.
> 
> Language nit: "Below there are described gitweb features ..." -> "The
> gitweb features ... are listed below".  The second sentence could
> probably be more apologetic, something like "Currently the only
> authoritative and complete list is in the comments in the gitweb.cgi
> source code."

Reformulated it to read:

  The gitweb features that are configurable via `%feature` hash are listed
  below.  This should be a complete list, but ultimately the authoritative
  and complete list is in gitweb.cgi source code, with features described
  in the comments.

> > +pickaxe::
> > +	Enable the so called pickaxe search, which will list the commits
> > +	that modified a given string in a file.  This can be practical and
> > +	quite faster alternative to "blame" action, but still potentially
> > +	CPU-intensive.  Enabled by default.
> > ++
> > +The pickaxe search is described in linkgit:git-log[1] (the
> > +description of `-S<string>` option, which refers to pickaxe entry in
> > +linkgit:gitdiffcore[7] for more details).
> > ++
> > +This feature can be configured on a per-repository basis via
> > +repository's `gitweb.pickaxe` configuration variable (boolean).
> 
> I think "modified" means "introduced or removed" (as an approximation to
> "changes the number of occurences of").

It is even more correct with respect to mechanism behind '-S' option.

> (By the way, is there any way for the the very paranoid to limit the
> length of regexes used or CPU time used per request, to get some
> reasonable cap on this sort of thing?  I guess that's more in the
> realm of web server configuration than something gitweb should be
> responsible for --- but if there's some common practice or well known
> reference on the topic, it could be worth mentioning at the top of
> this GITWEB FEATURES section some day.)

I don't know if it is possible to limit CPU time used per request.
I think most of time is spent in git commands; gitweb currently doesn't
offer any way to renice or limit time of running git commands.  The
only load-related mechanism is $maxload.

> > +show-sizes::
> > +	Enable showing size of blobs (ordinary files) in a "tree" view, in a
> > +	separate column, similar to what `ls -l` does; see description of
> > +	`-l` option in linkgit:git-ls-tree[1] manpage.  This cost a bit of
> > +	I/O.  Enabled by default.
> > ++
> > +This feature can be configured on a per-repository basis via
> > +repository's `gitweb.showsizes` configuration variable (boolean).
> 
> Nit: "cost" -> "costs".  Interesting (I guess packv4 will help with
> the CPU [decompression] cost of this when it comes).

Nb. the fast that directory entries ('tree' objects) do not have their
size displayed was result of request from people working on packv4, where
'tree' objects would be stored decomposed, and calculating object size
for a 'tree' object will be quite costly.

Add to that the fact that 'tree' object size is not very interesting...
 
> > +avatar::
> > +	Avatar support.  When this feature is enabled, views such as
> > +	"shortlog" or "commit" will display an avatar associated with
> > +	the email of the committer(s) and/or author(s).
> > ++
> > +Currently available providers are *"gravatar"* and *"picon"*.
> > +If an unknown provider is specified, the feature is disabled.
> > +*Note* that some provides might require extra Perl packages to be
> > +installed; see 'gitweb/INSTALL' for more details.
[...]

> If I set 'default' to ["gravator", "picon", "unknown"], will that
> really disable the feature?

Only one provider at a time can be selected; I have added this
information to description.
 

> > +forks::
> > +	Make gitweb consider projects in subdirectories of project root
> > +	(basename) to be forks of existing projects.  Given project
> > +	`$projname.git`, projects matching `$projname/*.git` will not be
> > +	shown in the main projects list, instead a \'+' mark will be added
> > +	to `$projname` there and a "forks" view will be enabled for the
> > +	project, listing all the forks.  If project list is taken from a
> > +	file, forks have to be listed after the main project.
> > ++
> > +Project specific override is not supported.
[...]

> If I understand correctly, "Given project `$projname.git`, projects
> matching `$projname/*.git`" can be written as "If the project
> `$projname.git` exists, projects in the `$projname/` directory".  Is
> it just projects in that directory, or are subdirectories included, as
> in the following example?
> 
> 	project.git
> 	project/foo/bar/baz/qux.git

Yes, they are.

The new improved and extended version is now:

  forks::
  	If this feature is enabled, gitweb considers projects in
  	subdirectories of project root (basename) to be forks of existing
  	projects.  For each project `$projname.git`, projects in the
  	`$projname/` directory and its subdirectories will not be
  	shown in the main projects list.  Instead, a \'+' mark is shown
  	next to `$projname`, which links to a "forks" view that lists all
  	the forks (all projects in `$projname/` subdirectory).  Additionally
  	a "forks" view for a project is linked from project summary page.
  +
  If the project list is taken from a file (`$projects_list` points to a
  file), forks are only recognized if they are listed after the main project
  in that file.
  +
  Project specific override is not supported.

> > +
> > +actions::
> > +	Insert custom links to the action bar of all project pages.  This
> > +	enables you to link to third-party scripts integrating into gitweb.
> > ++
> > +The "default" value consists of a list of triplets in the form
> > +`("<label>", "<link>", "<position>")` where "position" is the label
> > +after which to insert the link, "link" is a format string where `%n`
> > +expands to the project name, `%f` to the project path within the
> > +filesystem, `%h` to the current hash (\'h' gitweb parameter) and
> > +`%b` to the current hash base (\'hb' gitweb parameter); `%%` expands
> > +to \'%'.
> 
> Interesting.  "enables" should be "allows" here.

Right.
 
> Is %f an absolute path (i.e., starting with '/')?
 
It is "$projectroot/$project" (which I have added to this description).

> > +In most case you would want to change only default timezone:
> > ++
> > +---------------------------------------------------------------------------
> > +$feature{'javascript-timezone'}{'default'}[0] = "utc";
> > +---------------------------------------------------------------------------
[...]
> > +Timezone value can be "local" (for local timezone that browser uses), "utc"
> > +(what gitweb uses when JavaScript or this feature is disabled), or numerical
> > +timezone in the form of "+/-HHMM" for example "+0200".  This way is
> > +guaranteed to be backward compatibile.
> > ++
> > +Project specific override is not supported.
> 
> Language nits: when we are talking about supported values in general,
> it sounds better for some reason with "value" -> values"; "numerical
> timezone" -> "numerical timezones"; "for example" -> "such as".
> 
> What is "This way", and what other way should people be avoiding to
> prevent forward compatibility gotchas?
 
"This way" refers to the example configuration.  I have reformulated
this description by removing last sentence, and adding

  +
  The example configuration presented here is guaranteed to be backwards
  and forward compatible.

just after example.


> [...]
> > +GITWEB_CONFIG::
> > +	Sets location of per-instance configuration file.
> [...]
> > +FILES
> > +-----
> > +gitweb_config.perl::
> > +	This is default value for per-instance configuration file.  The
> > +	format of this file is described above.
> > +/etc/gitweb.conf::
> > +	This is default value for fallback system-wide configuration
> > +	file.  This file is used only if per-instance configuration
> > +	variable is not found.
> > +/etc/gitweb-common.conf::
> > +	This is default value for common system-wide configuration
> > +	file.
> 
> "default value" sounds strange here --- I guess I would have expected
> something more like
> 
>  ENVIRONMENT
>  -----------
>  ...
>  GITWEB_CONFIG::
> 	Path to use to find the per-instance configuration file,
> 	instead of gitweb_config.perl.  If relative, this is relative
> 	to $GITWEBDIR.
> 
>  FILES
>  -----
>  /etc/gitweb-common.conf::
> 	Options to be shared by all gitweb instances.  The format is
> 	described above.
>  $GITWEBDIR/gitweb_config.perl::
> 	Additional settings for a particular gitweb instance (in
> 	the same format).
>  /etc/gitweb.conf::
> 	Fallback configuration file with settings for gitweb instances
> 	that do not contain a gitweb_config.perl file.

I'll think about this proposal.

For now I have just changed "default value for ..." to
"default name of ...".
  

Thank you very much for your feedback.

-- 
Jakub Narebski
Poland

^ permalink raw reply

* Re: Bug: git log --numstat counts wrong
From: Alexander Pepper @ 2011-09-22 13:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Tay Ray Chuan
In-Reply-To: <7vobyd1vmo.fsf@alter.siamese.dyndns.org>

Am 21.09.2011 um 22:35 schrieb Junio C Hamano:
> That's a tad old master you seem to have.
Up until now I only followed the github clone, but that seems to be dated. Since kernel.org is still down, I'll now follow the google code clone.


Am 21.09.2011 um 22:35 schrieb Junio C Hamano:
>> Am 21.09.2011 um 14:24 schrieb Junio C Hamano:
>>>> $ git log --numstat 48a07e7e533f507228e8d1c99d4d48e175e14260
>>>> [...]
>>>> 11      10      src/java/voldemort/server/storage/StorageService.java
>>> 
>>> Didn't we update it this already? I seem to get 10/9 here not 11/10.
>> 
>> Current 'maint' (cd2b8ae9), 'master' (4b5eac7f)...
> 
> Strangely, bisection points at 27af01d5523, which was supposed to be only
> about performance and never about correctness. There is something fishy
> going on....
I also did some tests, and besides --numstat also git show sometimes show different patches in comparison to older git versions. The last version with the "old" git show output is 1.7.7.rc0 and the first version with the "new" git show output is 1.7.7.rc1.

with git version 1.7.7.rc0:
$ git show 679e1d5cd007a0a9cb2813bd155622d7a1e904bd :
[...]
diff --git a/src/java/voldemort/store/stats/RequestCounter.java b/src/java/voldemort/store/stats/RequestCounter.java
index b012e98..c6be603 100644
--- a/src/java/voldemort/store/stats/RequestCounter.java
+++ b/src/java/voldemort/store/stats/RequestCounter.java
@@ -64,16 +64,21 @@ public class RequestCounter {
         Accumulator accum = values.get();
         long now = System.currentTimeMillis();
 
-        if(now - accum.startTimeMS > durationMS) {
-            Accumulator newWithTotal = accum.newWithTotal();
-            values.set(newWithTotal);
-
-            /*
-             * try to set.  if we fail, then someone else set it, so just keep going
-             */
-            if(values.compareAndSet(accum, newWithTotal)) {
-                return newWithTotal;
-            }
+        /*
+         *  if still in the window, just return it
+         */
+        if(now - accum.startTimeMS <= durationMS) {
+            return accum;
+        }
+
+        /*
+         * try to set.  if we fail, then someone else set it, so just return that new one
+         */
+
+        Accumulator newWithTotal = accum.newWithTotal();
+
+        if(values.compareAndSet(accum, newWithTotal)) {
+            return newWithTotal;
         }
 
         return values.get();


with git version 1.7.7.rc1:
$ git show 679e1d5cd007a0a9cb2813bd155622d7a1e904bd
[...]
diff --git a/src/java/voldemort/store/stats/RequestCounter.java b/src/java/voldemort/store/stats/RequestCounter.java
index b012e98..c6be603 100644
--- a/src/java/voldemort/store/stats/RequestCounter.java
+++ b/src/java/voldemort/store/stats/RequestCounter.java
@@ -64,16 +64,21 @@ public class RequestCounter {
         Accumulator accum = values.get();
         long now = System.currentTimeMillis();
 
-        if(now - accum.startTimeMS > durationMS) {
-            Accumulator newWithTotal = accum.newWithTotal();
-            values.set(newWithTotal);
+        /*
+         *  if still in the window, just return it
+         */
+        if(now - accum.startTimeMS <= durationMS) {
+            return accum;
+        }
 
-            /*
-             * try to set.  if we fail, then someone else set it, so just keep going
-             */
-            if(values.compareAndSet(accum, newWithTotal)) {
-                return newWithTotal;
-            }
+        /*
+         * try to set.  if we fail, then someone else set it, so just return that new one
+         */
+
+        Accumulator newWithTotal = accum.newWithTotal();
+
+        if(values.compareAndSet(accum, newWithTotal)) {
+            return newWithTotal;
         }
 
         return values.get();


The difference is, that now it's shown as two delete and two added hunks instead of one bigger delete and one bigger added hunk.

with git version 1.7.7.rc1:
$ git show 679e1d5cd007a0a9cb2813bd155622d7a1e904bd | diffstat
 RequestCounter.java |   23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

with git version 1.7.7.rc0:
$ git show 679e1d5cd007a0a9cb2813bd155622d7a1e904bd | diffstat
 RequestCounter.java |   25 +++++++++++++++----------
 1 file changed, 15 insertions(+), 10 deletions(-)

When used git version 1.7.7.rc1 I didn't observed any case where git show and git log --numstat mismatch. I'm only a little confused, that 'git show' yields different results, depending on the git version.

Greetings from Berlin
Alex

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox