From: Jeff King <peff@peff.net>
To: Jay Soffian <jaysoffian@gmail.com>
Cc: Michael Haggerty <mhagger@alum.mit.edu>,
Junio C Hamano <gitster@pobox.com>,
git discussion list <git@vger.kernel.org>
Subject: Re: How to use git attributes to configure server-side checks?
Date: Thu, 22 Sep 2011 16:58:56 -0400 [thread overview]
Message-ID: <20110922205856.GA8563@sigill.intra.peff.net> (raw)
In-Reply-To: <CAG+J_DxdP2qHhttJOtWQTKeiDV2YbC_A_F+b9sDOZsWhWxjcjw@mail.gmail.com>
On Thu, Sep 22, 2011 at 02:41:42PM -0400, Jay Soffian wrote:
> attr.c says:
>
> a. built-in attributes
> b. $(prefix)/etc/gitattributes unless GIT_ATTR_NOSYSTEM is set
> c. core.attributesfile
> d. per-directory .gitattributes
> e. $GIT_DIR/info/attributes
>
> The mechanics of (d) are established by git_attr_direction:
>
> GIT_ATTR_CHECKIN: working-copy, then index
> GIT_ATTR_CHECKOUT: index, then working-copy
> GIT_ATTR_INDEX: index-only
Thanks for hunting it down. I had been thinking that "attributes from
tree" would come either before or after (d) above, but the concept
really fits better into the second list (i.e., they're per-directory
attributes, it's just a matter of which set of directories).
> Where GIT_ATTR_CHECKIN is the default direction and GIT_ATTR_CHECKOUT
> is used while checking-out files from the index. (GIT_ATTR_INDEX is
> used only by git-archive.)
Hrm. But git archive works correctly in a bare repo with no index at
all. It looks like we just populate an in-core index and feed that to
git_attr_set_direction. Which is a bit suboptimal for something like
diff, which might not need to look at all parts of the tree (I guess it
is similarly suboptimal for archive with pathspecs).
But still, those are just performance considerations. We could use the
same trick for diff/log in a bare repo. I guess we'd end up refreshing
the index from each commit in a multi-commit log, which might be
noticeably slow.
> Consistent with that, when comparing two commits (diff-tree), I think
> you look at the .gitattributes in the second commit.
That makes some sense to me. As Junio pointed out, there is a catch with
"diff -R". In that case, I would still think you would use the "second"
commit, even though we're reversing the diff. So:
git diff A B
would not be exactly equivalent to:
git diff -R B A
in that the second would use attributes from "A" instead of "B".
However, I think this is skirting around a somewhat larger issue, which
is that gitattributes are sometimes about "this is what the file is like
at the current state of history", and sometimes about "this is what this
file is like throughout history".
For example, imagine I've got a repository of binary files. At some
point, I decide to use gitattributes to configure a textconv filter. In
my non-bare repo, when I do "git log" I get nice textconv'd diffs going
back through all of history. But if I push to a bare repo and try to do
a "git log" there, my attributes are ignored. So in this case, I like
that the working tree's attributes are applied retroactively, and I'd
like the bare repo to do the same.
Now consider a different example. I have a repo with a script "foo" that
contains perl code, and I add .gitattributes marking it as such (for
func headers, or maybe I have a magic external diff, or whatever[1]).
Later, I decide that I hate perl and rewrite it in python, updating
gitattributes. Now I _don't_ want the retroactive behavior. When I do a
"git log -p", I want to see the old commits with the perl script shown
using the perl attribute, not the python. But what the heck would it
mean to diff a recent python commit against an old perl one?
Even though we think of the diff attributes as "here's how you diff",
they are really "here are some annotations about the end points". So for
something like a textconv, you care about the attributes of _both_
sides of a diff, applying the attributes of the "from" tree to the paths
in that tree. Something like funcname is harder. I guess you'd probably
want to use the attributes from the "from" tree, since it is about
reporting context from the "from" side.
So the semantics really end up depending on the particular attribute.
Something like "-delta" exists more outside the history or the state of
a particular commit.
So I think doing it "right" would be a lot more complicated than just
reading the commits from a particular tree. I think it would mean adding
more context to all attribute lookups, and having each caller decide how
the results should be used.
However, retroactively applying attributes from the working tree, even
though it is sometimes wrong (e.g., we get the perl/python thing wrong
_now_ if you have a working tree), is often convenient in practice
(because otherwise you end up duplicating your per-directory
gitattributes in your info/attributes file), and rarely is it actually
wrong (because changing the type of a file without changing its path
isn't all that common).
So if the status quo with working trees (i.e., retroactively applying
the current gitattributes to past commits) is good enough, perhaps the
bare-repository solution should be modeled the same way? In other words,
should "git log -p" in a bare repo be looking for attributes not in the
commits being diffed, but rather in HEAD[2]?
That at least would make it consistent with the non-bare behavior. And
then if we want to move forward with making particular attributes more
aware of their context, we can do it in _both_ the bare and non-bare
cases.
-Peff
[1] The perl/python thing is probably not a huge deal, as funcnames are
the most likely thing to configure. But you can imagine it would be
much worse if some binary file changes formats, and you were using
textconv. Or something with word-diff, perhaps.
[2] You can almost do this with:
git show HEAD:.gitattributes >info/attributes
git show commit-you-care-about
rm info/attributes
except that it won't handle any per-directory attributes files from
subdirectories. So I think you'd want a "--attributes-from=HEAD"
diff option or something similar.
next prev parent reply other threads:[~2011-09-22 20:59 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-09-21 19:32 How to use git attributes to configure server-side checks? Michael Haggerty
2011-09-21 20:02 ` Jay Soffian
2011-09-21 20:17 ` Junio C Hamano
2011-09-22 8:28 ` Michael Haggerty
2011-09-22 15:41 ` Jay Soffian
2011-09-22 17:13 ` Jeff King
2011-09-22 18:41 ` Jay Soffian
2011-09-22 19:22 ` Junio C Hamano
2011-09-22 20:58 ` Jeff King [this message]
2011-09-22 21:04 ` Jeff King
2011-09-23 10:06 ` Michael Haggerty
2011-09-23 19:33 ` Jeff King
2011-09-23 19:40 ` Junio C Hamano
2011-09-23 19:44 ` Jeff King
2011-09-24 6:05 ` Michael Haggerty
2011-09-24 6:15 ` Jeff King
2011-09-24 11:03 ` Michael Haggerty
2011-09-26 4:09 ` Junio C Hamano
2011-09-26 4:28 ` Michael Haggerty
2011-09-26 11:05 ` Jeff King
2011-09-26 14:14 ` Jakub Narebski
2011-09-26 15:11 ` Michael Haggerty
2011-09-22 17:26 ` Junio C Hamano
2011-09-23 8:35 ` Michael Haggerty
2011-09-23 12:49 ` Stephen Bash
2011-09-23 13:31 ` Michael Haggerty
2011-09-22 22:54 ` Jakub Narebski
2011-09-23 10:38 ` Michael Haggerty
2012-02-17 18:42 ` Michael Haggerty
2012-02-17 19:26 ` Junio C Hamano
2012-02-17 19:59 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110922205856.GA8563@sigill.intra.peff.net \
--to=peff@peff.net \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jaysoffian@gmail.com \
--cc=mhagger@alum.mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).