* git-p4 using notes
From: Michael Horowitz @ 2011-12-16 16:07 UTC (permalink / raw)
To: git
For those of you using git-p4 because of a company requirement to use
Perforce, but really wish you could use git only, the most frustrating
part is the fact that when changes are submitted, the commit message
is rewritten to include a reference to the P4 change number which is
used by the sync. When syncing back changes, this causes the commit
hash to be different, and so blows away your old commit and any parent
commit references and such.
I read someplace, I can't remember where at this point, that if git-p4
used notes to write the P4 change information, that would not impact
the commit hash, so when merging back, things would not be
overwritten, and you can maintain branches and commit history properly
in git.
I just ran into this project, where it seems that someone has
re-written git-p4 to use notes: https://github.com/ermshiperete/git-p4
I was wondering if any of the maintainers of git-p4 has considered
this, and might want to leverage this work to merge into the main git
repo, possibly with an option to choose between the two behaviors.
Thanks,
Mike
^ permalink raw reply
* Re: How to generate pull-request with info of signed tag
From: Junio C Hamano @ 2011-12-16 16:22 UTC (permalink / raw)
To: Aneesh Kumar K.V; +Cc: Git Mailing List
In-Reply-To: <874nx1korf.fsf@linux.vnet.ibm.com>
"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:
> I am using git from master branch and wanted to try the signed pull
> request. I have pushed the signed tag to repo.or.cz, but not sure how to
> generate pull request with signed tag information ? git-pull-request
> insist on a branch on the server and put the branch details in the
> pull-request text, It do add tag description but not the tag name and
> still put "repo-name branch" name in the txt. Shouldn't that be
> "repo-name tag-name" so that one can cut-paste that in pull request ?
Yeah, you are right.
^ permalink raw reply
* Re* How to generate pull-request with info of signed tag
From: Junio C Hamano @ 2011-12-16 16:56 UTC (permalink / raw)
To: Aneesh Kumar K.V; +Cc: Git Mailing List, Junio C Hamano
In-Reply-To: <874nx1korf.fsf@linux.vnet.ibm.com>
"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:
> I am using git from master branch and wanted to try the signed pull
> request. I have pushed the signed tag to repo.or.cz, but not sure how to
> generate pull request with signed tag information?
The correct script should grok the following command line:
$ git request-pull v1.7.7.4 git://git.kernel.org/pub/scm/git/git.git v1.7.7.5
and include
are available in the git repository at
git://git.kernel.org/.../git.git tag v1.7.7.5
for you to fetch changes up to 66c1...
but we didn't loosen the code that inspects the publishing repository to
allow asking for a tag that points at an older part of the history to be
pulled.
Here is an update.
-- >8 --
Subject: request-pull: update the "pull" command generation logic
The old code that insisted on asking for the tip of a branch to be pulled
were not updated when we started allowing for a tag to be pulled. When a
tag points at an older part of the history and there is no branch that
points at the tagged commit, the script failed to say which ref is to be
pulled.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
git-request-pull.sh | 46 +++++++++++++++++++++++++++++++++++++---------
1 files changed, 37 insertions(+), 9 deletions(-)
diff --git a/git-request-pull.sh b/git-request-pull.sh
index c6a5b7a..7b5c777 100755
--- a/git-request-pull.sh
+++ b/git-request-pull.sh
@@ -57,12 +57,40 @@ headrev=$(git rev-parse --verify "$head"^0) || exit
merge_base=$(git merge-base $baserev $headrev) ||
die "fatal: No commits in common between $base and $head"
-find_matching_branch="/^$headrev "'refs\/heads\//{
- s/^.* refs\/heads\///
- p
- q
-}'
-branch=$(git ls-remote "$url" | sed -n -e "$find_matching_branch")
+# $head is the token given from the command line. If a ref with that
+# name exists at the remote and their values match, we should use it.
+# Otherwise find a ref that matches $headrev.
+find_matching_ref='
+ sub abbr {
+ my $ref = shift;
+ if ($ref =~ s|refs/heads/||) {
+ return $ref;
+ } elsif ($ref =~ s|refs/tags/||) {
+ return "tag $ref";
+ } else {
+ return $ref;
+ }
+ }
+
+ my ($exact, $found);
+ while (<STDIN>) {
+ my ($sha1, $ref, $deref) = /^(\S+)\s+(\S+?)(\^\{\})?$/;
+ next unless ($sha1 eq $ARGV[1]);
+ $found = abbr($ref);
+ if ($ref =~ m|/\Q$ARGV[0]\E$|) {
+ $exact = $found;
+ last;
+ }
+ }
+ if ($exact) {
+ print "$exact\n";
+ } elsif ($found) {
+ print "$found\n";
+ }
+'
+
+ref=$(git ls-remote "$url" | perl -e "$find_matching_ref" "$head" "$headrev")
+
url=$(git ls-remote --get-url "$url")
git show -s --format='The following changes since commit %H:
@@ -71,7 +99,7 @@ git show -s --format='The following changes since commit %H:
are available in the git repository at:
' $baserev &&
-echo " $url${branch+ $branch}" &&
+echo " $url${ref+ $ref}" &&
git show -s --format='
for you to fetch changes up to %H:
@@ -81,7 +109,7 @@ for you to fetch changes up to %H:
if test -n "$branch_name"
then
- echo "(from the branch description for $branch local branch)"
+ echo "(from the branch description for $branch_name local branch)"
echo
git config "branch.$branch_name.description"
fi &&
@@ -101,7 +129,7 @@ fi &&
git shortlog ^$baserev $headrev &&
git diff -M --stat --summary $patch $merge_base..$headrev || status=1
-if test -z "$branch"
+if test -z "$ref"
then
echo "warn: No branch of $url is at:" >&2
git show -s --format='warn: %h: %s' $headrev >&2
^ permalink raw reply related
* Re: [PATCH] attr: map builtin userdiff drivers to well-known extensions
From: Junio C Hamano @ 2011-12-16 17:01 UTC (permalink / raw)
To: Johannes Sixt; +Cc: Jeff King, git, Brandon Casey
In-Reply-To: <4EEB4F13.2010402@viscovery.net>
Johannes Sixt <j.sixt@viscovery.net> writes:
> Am 12/16/2011 12:00, schrieb Jeff King:
>> static const char *builtin_attr[] = {
> ...
>> + "*.c diff=cpp",
>> + "*.cc diff=cpp",
>> + "*.cxx diff=cpp",
>> + "*.cpp diff=cpp",
>> + "*.h diff=cpp",
>> + "*.hpp diff=cpp",
>
> Please don't do this. It would be a serious regression for C++ coders, and
> some C coders as well. The built-in hunk header patterns are severly
> broken and don't work well with C++ code.
I do not often do C++, but I tend to agree wrt C.
^ permalink raw reply
* Re: Any tips for improving the performance of cloning large repositories?
From: Junio C Hamano @ 2011-12-16 17:08 UTC (permalink / raw)
To: Alex Bennee; +Cc: Seth Robertson, Hallvard Breien Furuseth, git
In-Reply-To: <CAJ-05NPbRmyx=a+U7BK4rNShBgaXj+g-Bwc1aBDDb3N0VPBW=A@mail.gmail.com>
Alex Bennee <kernel-hacker@bennee.com> writes:
> Well that's counter intuitive....
>
> - reverting the original repo to one big pack speeds up the clone
> - adding a --local --reference mirror slows it down
Neither is. Read what "--local" says in the help text of clone. It
disables the git aware clever optimization.
^ permalink raw reply
* Re: [PATCH v3 2/3] grep: enable threading with -p and -W using lazy attribute lookup
From: Junio C Hamano @ 2011-12-16 17:34 UTC (permalink / raw)
To: Johannes Sixt
Cc: Thomas Rast, git, Junio C Hamano, René Scharfe, Jeff King,
Eric Herman
In-Reply-To: <4EEAFFAF.8030003@viscovery.net>
Johannes Sixt <j.sixt@viscovery.net> writes:
> This is the first time we use pthread_mutex_t in a header file. We need at
> least the following squashed in. An alternative would be to include
> "thread-utils.h", but thread-utils is really more about implementation
> helpers functions, not about types,...
builtin/grep.c already uses thread-utils.h since 5b594f4 (Threaded grep,
2010-01-25), so does builtin/pack-objects.c since 833e3df (pack-objects:
Add runtime detection of online CPU's, 2008-02-22), so it may be simpler
to do so in grep.h instead.
diff --git a/builtin/grep.c b/builtin/grep.c
index bc23c3c..6474eed 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -17,7 +17,6 @@
#include "grep.h"
#include "quote.h"
#include "dir.h"
-#include "thread-utils.h"
static char const * const grep_usage[] = {
"git grep [options] [-e] <pattern> [<rev>...] [[--] <path>...]",
diff --git a/grep.h b/grep.h
index 15d227c..dd4c65e 100644
--- a/grep.h
+++ b/grep.h
@@ -133,8 +133,11 @@ extern struct grep_opt *grep_opt_dup(const struct grep_opt *opt);
extern int grep_threads_ok(const struct grep_opt *opt);
#ifndef NO_PTHREADS
-/* Mutex used around access to the attributes machinery if
- * opt->use_threads. Must be initialized/destroyed by callers! */
+#include "thread-utils.h"
+/*
+ * Mutex used around access to the attributes machinery if
+ * opt->use_threads. Must be initialized/destroyed by callers!
+ */
extern pthread_mutex_t grep_attr_mutex;
#endif
^ permalink raw reply related
* [BUG] attribute "eol" with "crlf"
From: Ralf Thielow @ 2011-12-16 17:44 UTC (permalink / raw)
To: git
There's a bug in git-1.7.8 if you use the attribute "eol" with "crlf".
Steps to reproduce:
- add and commit a text file which uses 0d0a for line breaks
7465 7374 0d0a 0d0a 7465 7374 0d0a test....test..
- add ".gitattributes" with "*.txt eol=crlf"
- change a line in the file
- execute "git checkout [file]"
The result is:
7465 7374 0d0d 0a0d 0d0a 7465 7374 0d0d test......test..
0d0a was replaced by 0d0d0a.
^ permalink raw reply
* Re: [PATCH] attr: map builtin userdiff drivers to well-known extensions
From: Mark Levedahl @ 2011-12-16 17:51 UTC (permalink / raw)
To: Jeff King; +Cc: Junio C Hamano, git, Brandon Casey
In-Reply-To: <20111216110000.GA15676@sigill.intra.peff.net>
On 12/16/2011 06:00 AM, Jeff King wrote:
> + "*.m diff=objc",
Please don't do this: Matlab files also use .m as a suffix, and there is
little to no compatibility between objective c and Matlab syntax.
Mark
^ permalink raw reply
* [PATCH] Adding hooks.directory config option; wiring into run_hook
From: Christopher Dale @ 2011-12-16 18:00 UTC (permalink / raw)
To: git
>From 92a34696bb4e76ae7a967666234f13d04858bb68 Mon Sep 17 00:00:00 2001
From: Christopher Dale <chrelad@gmail.com>
Date: Fri, 16 Dec 2011 10:55:26 -0600
Subject: [PATCH] Adding hooks.directory config option; wiring into run_hook
The new hooks.directory config option allows each git repository to
specify the location of the hooks for that repository. The default
remains $GIT_DIR/hooks. The ability to change the hooks directory is
necessary when stuck in an environment with enhanced security and
trusted path execution policies. These systems require that any file
that can be executed exhibit at least the following characteristics:
* The executable, it's directory, and each directory above it are
not writable.
Since the hooks directory is within a directory that by it's very nature
requires write permissions, hooks are a non-starter in git's current
state. This patch aims to allow a (most likely bare) repo to have it's
hooks directory customized to a location that meets the above
requirements.
I'm not terribly good at C++, so please let me know if I need to fix
anything. I saw that there were a bunch of scripts that have
GIT_DIR/hooks hard-coded in them. Since I'm not familiar with those
scripts, I will leave them alone for now. Maybe someone that is familiar
with the scripts can integrate the new configuration option into them?
---
Documentation/config.txt | 5 +++++
contrib/completion/git-completion.bash | 1 +
run-command.c | 25 +++++++++++++++++++++++--
3 files changed, 29 insertions(+), 2 deletions(-)
diff --git a/Documentation/config.txt b/Documentation/config.txt
index 8a7d2d4..c23417c 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1226,6 +1226,11 @@ help.autocorrect::
value is 0 - the command will be just shown but not executed.
This is the default.
+hooks.directory::
+ Override the default hook directory location GIT_DIR/hooks. This can be
+ usefull if you are in an environment that has trusted path execution for
+ example.
+
http.proxy::
Override the HTTP proxy, normally configured using the 'http_proxy'
environment variable (see linkgit:curl[1]). This can be overridden
diff --git a/contrib/completion/git-completion.bash
b/contrib/completion/git-completion.bash
index cc1bdf9..066948e 100755
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -2177,6 +2177,7 @@ _git_config ()
help.autocorrect
help.browser
help.format
+ hooks.directory
http.lowSpeedLimit
http.lowSpeedTime
http.maxRequests
diff --git a/run-command.c b/run-command.c
index 1c51043..2e5fa16 100644
--- a/run-command.c
+++ b/run-command.c
@@ -1,4 +1,5 @@
#include "cache.h"
+#include "diff.h"
#include "run-command.h"
#include "exec_cmd.h"
#include "argv-array.h"
@@ -65,6 +66,7 @@ static int execv_shell_cmd(const char **argv)
#ifndef WIN32
static int child_err = 2;
static int child_notifier = -1;
+static const char *hook_directory = NULL;
static void notify_parent(void)
{
@@ -603,6 +605,14 @@ int finish_async(struct async *async)
#endif
}
+static int git_hook_config(const char *var, const char *value, void *cb)
+{
+ if (!strcmp(var, "hooks.directory"))
+ return git_config_pathname(&hook_directory, var, value);
+
+ return git_diff_ui_config(var, value, cb);
+}
+
int run_hook(const char *index_file, const char *name, ...)
{
struct child_process hook;
@@ -612,11 +622,22 @@ int run_hook(const char *index_file, const char
*name, ...)
va_list args;
int ret;
- if (access(git_path("hooks/%s", name), X_OK) < 0)
+ // If this is not reset to NULL, then strange stuff happens
+ hook_directory = NULL;
+
+ // Load the configuration for hooks.directory
+ git_config(git_hook_config, NULL);
+
+ // If the configuration is not set for hooks directory, set it to the
+ // default GIT_PATH/hooks directory that we all know and love.
+ if(hook_directory == NULL)
+ hook_directory = git_path("hooks");
+
+ if (access(mkpath("%s/%s", hook_directory, name), X_OK) < 0)
return 0;
va_start(args, name);
- argv_array_push(&argv, git_path("hooks/%s", name));
+ argv_array_push(&argv, mkpath("%s/%s", hook_directory, name));
while ((p = va_arg(args, const char *)))
argv_array_push(&argv, p);
va_end(args);
--
1.7.5.2.353.g5df3e
^ permalink raw reply related
* Re: [BUG] attribute "eol" with "crlf"
From: Junio C Hamano @ 2011-12-16 18:03 UTC (permalink / raw)
To: Ralf Thielow; +Cc: git
In-Reply-To: <CAN0XMO+OOdTJ+aNMSc2G3RVc7Wfypr4+7dU3US9GVAmMiSJ7cg@mail.gmail.com>
Can you bisect?
^ permalink raw reply
* Re: [PATCH] Adding hooks.directory config option; wiring into run_hook
From: Junio C Hamano @ 2011-12-16 18:06 UTC (permalink / raw)
To: Christopher Dale; +Cc: git
In-Reply-To: <CADQnX_e76LzuRUDOKFOsRHU=e8Cw+qh5x1BdW5HMEdMmP5PaHg@mail.gmail.com>
Christopher Dale <chrelad@gmail.com> writes:
> ...
> trusted path execution policies. These systems require that any file
> that can be executed exhibit at least the following characteristics:
>
> * The executable, it's directory, and each directory above it are
> not writable.
>
> Since the hooks directory is within a directory that by it's very nature
> requires write permissions,...
Sorry, but I am not interested at all. If you can write into $GIT_DIR/config
then you can point at any directory with hooks.directory and that would mean
it would defeat your "trusted path execution policies".
^ permalink raw reply
* Re: [BUG] attribute "eol" with "crlf"
From: Matthieu Moy @ 2011-12-16 18:21 UTC (permalink / raw)
To: Ralf Thielow; +Cc: git
In-Reply-To: <CAN0XMO+OOdTJ+aNMSc2G3RVc7Wfypr4+7dU3US9GVAmMiSJ7cg@mail.gmail.com>
Ralf Thielow <ralf.thielow@googlemail.com> writes:
> There's a bug in git-1.7.8 if you use the attribute "eol" with "crlf".
>
> Steps to reproduce:
> - add and commit a text file which uses 0d0a for line breaks
> 7465 7374 0d0a 0d0a 7465 7374 0d0a test....test..
> - add ".gitattributes" with "*.txt eol=crlf"
> - change a line in the file
> - execute "git checkout [file]"
>
> The result is:
> 7465 7374 0d0d 0a0d 0d0a 7465 7374 0d0d test......test..
It seems to me to be the expected behavior. You committed a file whose
line endings are not normalized to LF in the repository, and asked for a
conversion LF -> CRLF on checkout, which Git did.
Git can't know exactly the moment when you edit .gitattributes, so it
can't do the conversion at the time you add the eol=crlf attribute. It
does it on checkout.
> 0d0a was replaced by 0d0d0a.
I'd say 0a (LF) was replaced by 0d0a (CRLF).
What behavior would you have expected?
--
Matthieu Moy
http://www-verimag.imag.fr/~moy/
^ permalink raw reply
* Re: [PATCH] Adding hooks.directory config option; wiring into run_hook
From: Christopher Dale @ 2011-12-16 18:28 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <7vmxasie6k.fsf@alter.siamese.dyndns.org>
Okidokee, thanks for your feedback :)
On Fri, Dec 16, 2011 at 12:06 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Christopher Dale <chrelad@gmail.com> writes:
>
>> ...
>> trusted path execution policies. These systems require that any file
>> that can be executed exhibit at least the following characteristics:
>>
>> * The executable, it's directory, and each directory above it are
>> not writable.
>>
>> Since the hooks directory is within a directory that by it's very nature
>> requires write permissions,...
>
> Sorry, but I am not interested at all. If you can write into $GIT_DIR/config
> then you can point at any directory with hooks.directory and that would mean
> it would defeat your "trusted path execution policies".
^ permalink raw reply
* Re: [BUG] attribute "eol" with "crlf"
From: Ralf Thielow @ 2011-12-16 18:28 UTC (permalink / raw)
To: Matthieu Moy; +Cc: git
In-Reply-To: <vpqr504wf70.fsf@bauges.imag.fr>
> What behavior would you have expected?
I've expected that git doesn't change the line endings
because it's already CRLF.
^ permalink raw reply
* Re: [msysGit] Windows & executable bit for newly created files
From: Dirk Süsserott @ 2011-12-16 18:27 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Git Mailing List
In-Reply-To: <7vwr9xmu48.fsf@alter.siamese.dyndns.org>
Am 15.12.2011 21:56 schrieb Junio C Hamano:
> Dirk Süsserott <newsletter@dirk.my1.cc> writes:
>
>> Is there a way to convince git that the new mode is 755 instead of 644,
>> even with core.filemode set to false? So that the mode is correct when I
>> checkout the file under Linux later on?
>
> "git update-index --chmod=+x"?
>
Oh, thanks. I wasn't aware of that option. Works exactly as I like.
Dirk
^ permalink raw reply
* Re: Any tips for improving the performance of cloning large repositories?
From: Alex Bennee @ 2011-12-16 18:37 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Seth Robertson, Hallvard Breien Furuseth, git
In-Reply-To: <7vzkesigw9.fsf@alter.siamese.dyndns.org>
On 16 December 2011 17:08, Junio C Hamano <gitster@pobox.com> wrote:
> Alex Bennee <kernel-hacker@bennee.com> writes:
>
>> Well that's counter intuitive....
>>
>> - reverting the original repo to one big pack speeds up the clone
>> - adding a --local --reference mirror slows it down
>
> Neither is. Read what "--local" says in the help text of clone. It
> disables the git aware clever optimization.
OK that's not how I read the man page:
--local, -l
When the repository to clone from is on a local machine,
this flag bypasses the normal "git aware" transport
mechanism and clones the repository by making a copy of
HEAD and everything under objects and refs directories.
So this says it skips "git aware" (whatever that means)
The files under .git/objects/ directory are hardlinked to
save space when possible. This is now the default when
the source repository is specified with /path/to/repo
syntax, so it essentially is a no-op option. To force
copying instead of hardlinking (which may be desirable if
you are trying to make a back-up of your repository),
but still avoid the usual "git aware" transport mechanism,
--no-hardlinks can be used.
And this says that objects on the local file-system are hardlinked
(rather than copied) which I assumed was a optimal approach.
--no-hardlinks
Optimize the cloning process from a repository on a local
filesystem by copying files under .git/objects
directory.
I'm not sure how this is an optimization? This means more copying
rather than linking right?
--
Alex, homepage: http://www.bennee.com/~alex/
http://www.half-llama.co.uk
^ permalink raw reply
* Re: [BUG] attribute "eol" with "crlf"
From: Ralf Thielow @ 2011-12-16 18:49 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <7vr504ieco.fsf@alter.siamese.dyndns.org>
> Can you bisect?
e322ee38ad8d655f5a32b3482ae9ce813b73e4bc
^ permalink raw reply
* Re: Revisiting metadata storage
From: Hilco Wijbenga @ 2011-12-16 18:55 UTC (permalink / raw)
To: Jonathan Nieder; +Cc: Richard Hartmann, Ronan Keryell, Git List
In-Reply-To: <20111216075251.GA4048@elie.hsd1.il.comcast.net>
On 15 December 2011 23:52, Jonathan Nieder <jrnieder@gmail.com> wrote:
> Hilco Wijbenga wrote:
>
>> Right now every rebase means a full (and almost
>> completely unnecessary) rebuild.
>
> It sounds like what you are suffering from is that "git rebase" uses
> the worktree as its workspace instead of doing all that work
> in-memory, right?
Yes, I guess the problem is that it uses the worktree as its workspace.
(I know others disagree but to me it's a bug that Git touches files
that it doesn't actually change.)
> If I were in your situation, I would do the following:
>
> 1. Don't rebase so often. When wanting to take advantage of features
> from a new upstream version, use "git merge" to pull it in. Only
> rebase when it is time to make the history presentable for other
> people.
I usually rebase in the morning to get an up-to-date tree. Is that
considered too often? Perhaps it's my Subversion background but I'm
not comfortable diverging too much. Is that too paranoid? :-)
So IIUC, I can do "git rebase master" even after multiple "git merge master"s?
> This way, "git log --first-parent" will give easy access to
> the intermediate versions you have hacked on and tested recently.
Why is "git log --first-parent" important? I read "git help log" on
first-parent but that didn't really tell me much. Google was not very
helpful either.
> 2. When history gets ugly and you want to rebase to make the series
> easier to make sense of, use a separate workdir:
>
> $ git branch tmp; # make a copy to rebase
This is in my merged branch, right?
>
> $ cd ..
> $ git new-workdir repo rebase-scratch tmp
> $ cd rebase-scratch
> $ git rebase -i origin/master
> ...
> $ cd ..
> $ rm -fr rebase-scratch
>
> $ cd repo
> $ git diff HEAD tmp; # Does the rebased version look better?
> $ git reset --keep tmp; # Yes. Use it.
> $ git branch -d tmp
Interesting. If I run the rebase after the merge, rebase appears to do
much less work. I.e. it appears to only touch files that have actually
changed. Is that true?
> 3. Once the rebased history looks reasonably good, be sure to rebase
> one final time and test each commit before submitting for other
> people's use.
>
> Hope that helps,
Yes, thanks for pointing out yet more useful Git options. There seems
no end to them. :-)
Cheers,
Hilco
^ permalink raw reply
* Re: [PATCH] attr: map builtin userdiff drivers to well-known extensions
From: Jeff King @ 2011-12-16 19:21 UTC (permalink / raw)
To: Johannes Sixt; +Cc: Junio C Hamano, git, Brandon Casey
In-Reply-To: <4EEB4F13.2010402@viscovery.net>
On Fri, Dec 16, 2011 at 03:00:51PM +0100, Johannes Sixt wrote:
> Am 12/16/2011 12:00, schrieb Jeff King:
> > static const char *builtin_attr[] = {
> ...
> > + "*.c diff=cpp",
> > + "*.cc diff=cpp",
> > + "*.cxx diff=cpp",
> > + "*.cpp diff=cpp",
> > + "*.h diff=cpp",
> > + "*.hpp diff=cpp",
>
> Please don't do this. It would be a serious regression for C++ coders, and
> some C coders as well. The built-in hunk header patterns are severly
> broken and don't work well with C++ code. I know for sure that the
> following are not recognized:
>
> - template declarations, e.g. template<class T> func(T x);
> - constructor definitionss, e.g. MyClass::MyClass()
> - functions that return references, e.g. const string& func()
> - function definitions along the GNU coding style, e.g.
>
> void
> the_func ()
Hmm. I think it's a legitimate criticism to say "hunk-header detection
is a broken feature because our heuristics aren't good enough, and we
shouldn't start using it by default because people will complain because
it sucks too much".
At the same time, I think we have seen people complaining that the
regular dumb funcname detection is not good enough[1], and that using
language-specific funcnames, while not 100% perfect, produces better
results on the whole.
So I think rather than saying "this doesn't always work", it's important
to ask "on the whole, does this tend to produce better results than
without, and when we are wrong, how bad is it?"
I'm not clear from what you wrote on whether you were saying it is
simply sub-optimal, or whether on balance it is way worse than the
default funcname matching.
And if it is bad on balance, is the right solution to avoid exposing
people to it, or is it to make our patterns better? I.e., is it fixable,
or is it simply too hard a problem to get right in the general case, and
we shouldn't turn it on by default?
> I am currently using this pattern (but I'm sure it can be optimized) with
> an appropriate xcpp attribute:
>
> [diff "xcpp"]
> xfuncname = "!^[
> \\t]*[a-zA-Z_][a-zA-Z_0-9]*[^()]*:[[:space:]]*$\n^[a-zA-Z_][a-zA-Z_0-9]*.*"
So, I'm confused. If you are using this, surely you have "*.c diff=xcpp"
in your attributes file, and my patch has no effect for you, as it is
lower precedence than user-supplied gitattributes? Also, if you called
it diff.cpp.xfuncname, then wouldn't my patch still be useful, as your
complaint is not "my *.c files are not actually C language" but "the C
language driver sucks" (but you be remedying that by providing your own
config).
-Peff
^ permalink raw reply
* Re: [PATCH] Adding hooks.directory config option; wiring into run_hook
From: Junio C Hamano @ 2011-12-16 19:23 UTC (permalink / raw)
To: Christopher Dale; +Cc: git
In-Reply-To: <7vmxasie6k.fsf@alter.siamese.dyndns.org>
Junio C Hamano <gitster@pobox.com> writes:
> Christopher Dale <chrelad@gmail.com> writes:
>
>> ...
>> trusted path execution policies. These systems require that any file
>> that can be executed exhibit at least the following characteristics:
>>
>> * The executable, it's directory, and each directory above it are
>> not writable.
>>
>> Since the hooks directory is within a directory that by it's very nature
>> requires write permissions,...
>
> Sorry, but I am not interested at all. If you can write into $GIT_DIR/config
> then you can point at any directory with hooks.directory and that would mean
> it would defeat your "trusted path execution policies".
I was about to follow-up with "the only option that may make sense in such
an environment may be to disable hooks", but after thinking about it more,
if somebody enables hooks, the environment will make sure that they will
fail to execute, and it would be an incentive enough for people to disable
them. IOW, no need to have such an option, even.
^ permalink raw reply
* Re: [PATCH] attr: map builtin userdiff drivers to well-known extensions
From: Philip Oakley @ 2011-12-16 19:26 UTC (permalink / raw)
To: Jeff King, Junio C Hamano; +Cc: git, Brandon Casey
In-Reply-To: <20111216110000.GA15676@sigill.intra.peff.net>
From: "Jeff King" <peff@peff.net>
> We already provide sane hunk-header patterns for specific
> languages.
>
> However, the user has to manually map common extensions to
> use them. It's not that hard to do, but it's an extra step
> that the user might not even know is an option. Let's be
> nice and do it automatically.
>
> It could be a problem in the future if the builtin userdiff
> drivers started growing more invasive options, like
> automatically claiming to be non-binary (e.g., setting
> diff.cpp.binary = false by default), but right now we do not
> do that, so it should be safe. To help safeguard against
> future changes, we add a new test to t4012 making sure that
> we don't consider binary files as text by their extension.
>
> We also have to update t4018, which assumed that without a
> .gitattributes file, we would receive the default funcname
> pattern for a file matching "*.java". Changing this behavior
> is not covering up a regression, but rather the feature
> working as intended.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> I forgot to send this out in time for v1.7.8.
>
> Prior discussion here:
>
> http://thread.gmane.org/gmane.comp.version-control.git/180103
>
> and here:
>
> http://thread.gmane.org/gmane.comp.version-control.git/181253
>
> The list of extensions is collected from those threads. The tests are
> new since the last time I posted (and the t4018 is slightly different
> than what you queued in pu).
>
> I punted on the question of case-sensitivity. Brandon mentioned using
> fnmatch_icase to handle this, which sounds sane, but I think it is
> really a separate topic.
>
> attr.c | 24 ++++++++++++++++++++++++
> t/t4012-diff-binary.sh | 13 +++++++++++++
> t/t4018-diff-funcname.sh | 10 +++++++++-
> 3 files changed, 46 insertions(+), 1 deletions(-)
>
> diff --git a/attr.c b/attr.c
> index 76b079f..2ad7cc4 100644
> --- a/attr.c
> +++ b/attr.c
> @@ -306,6 +306,30 @@ static void free_attr_elem(struct attr_stack *e)
>
> static const char *builtin_attr[] = {
> "[attr]binary -diff -text",
> + "*.html diff=html",
> + "*.htm diff=html",
> + "*.java diff=java",
> + "*.perl diff=perl",
> + "*.pl diff=perl",
> + "*.php diff=php",
> + "*.py diff=python",
> + "*.rb diff=ruby",
> + "*.bib diff=bibtex",
> + "*.tex diff=tex",
> + "*.c diff=cpp",
> + "*.cc diff=cpp",
> + "*.cxx diff=cpp",
> + "*.cpp diff=cpp",
> + "*.h diff=cpp",
> + "*.hpp diff=cpp",
> + "*.cs diff=csharp",
> + "*.[Ff] diff=fortran",
> + "*.[Ff][0-9][0-9] diff=fortran",
> + "*.m diff=objc",
There is a conflict here with the Matlab community which also uses "*.m"
files for its scripts and code.
They fit the "It's not that hard to do, but it's an extra step that the user
might not even know is an option." rationale.
If the objc.m is used as a default it must be overidable easily, and listed
in the appropriate documentation to mitigate the "might not even know" risk.
Philip
> + "*.mm diff=objc",
> + "*.pas diff=pascal",
> + "*.pp diff=pascal",
> + "*.lpr diff=pascal",
> NULL,
> };
>
> diff --git a/t/t4012-diff-binary.sh b/t/t4012-diff-binary.sh
> index 2d9f9a0..b2fc807 100755
> --- a/t/t4012-diff-binary.sh
> +++ b/t/t4012-diff-binary.sh
> @@ -90,4 +90,17 @@ test_expect_success 'diff --no-index with binary
> creation' '
> test_cmp expected actual
> '
>
> +test_expect_success 'binary files are not considered text by file
> extension' '
> + echo Q | q_to_nul >binary.c &&
> + git add binary.c &&
> + cat >expect <<-\EOF &&
> + diff --git a/binary.c b/binary.c
> + new file mode 100644
> + index 0000000..1f2a4f5
> + Binary files /dev/null and b/binary.c differ
> + EOF
> + git diff --cached binary.c >actual &&
> + test_cmp expect actual
> +'
> +
> test_done
> diff --git a/t/t4018-diff-funcname.sh b/t/t4018-diff-funcname.sh
> index 4bd2a1c..a6227ef 100755
> --- a/t/t4018-diff-funcname.sh
> +++ b/t/t4018-diff-funcname.sh
> @@ -124,7 +124,9 @@ do
> done
>
> test_expect_success 'default behaviour' '
> - rm -f .gitattributes &&
> + cat >.gitattributes <<-\EOF &&
> + *.java diff=default
> + EOF
> test_expect_funcname "public class Beer\$"
> '
>
> @@ -187,4 +189,10 @@ test_expect_success 'alternation in pattern' '
> test_expect_funcname "public static void main("
> '
>
> +test_expect_success 'custom diff drivers override built-in extension
> matches' '
> + test_config diff.foo.funcname "int special" &&
> + echo "*.java diff=foo" >.gitattributes &&
> + test_expect_funcname "int special"
> +'
> +
> test_done
> --
> 1.7.7.4.13.g57bf4
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
> -----
> No virus found in this message.
> Checked by AVG - www.avg.com
> Version: 2012.0.1890 / Virus Database: 2108/4682 - Release Date: 12/15/11
>
^ permalink raw reply
* Re: [PATCH] attr: map builtin userdiff drivers to well-known extensions
From: Jeff King @ 2011-12-16 19:28 UTC (permalink / raw)
To: Mark Levedahl; +Cc: Junio C Hamano, git, Brandon Casey
In-Reply-To: <4EEB8517.2030304@gmail.com>
On Fri, Dec 16, 2011 at 12:51:19PM -0500, Mark Levedahl wrote:
> On 12/16/2011 06:00 AM, Jeff King wrote:
>
> >+ "*.m diff=objc",
>
> Please don't do this: Matlab files also use .m as a suffix, and there
> is little to no compatibility between objective c and Matlab syntax.
Thanks for the feedback. Unlike JSixt's objection, I think this one is
at the heart of the patch: using file extensions to map to file types is
just a heuristic, and that heuristic can be spectacularly wrong.
And that's why we took the conservative approach until now, and simply
left it up to projects to define their own attributes mapping files to
types (even though we provided funcname patterns for some types).
So I think it is really worth weighing the convenience of "user does not
have to bother configuring attributes for each project" versus "we might
get it wrong".
Fortunately, the "might get it wrong" side is pretty easily mitigated by
making .gitattributes file (i.e., the same thing they would have to do
without this mapping heuristic). So the question is not "did we get it
wrong", but "how much worse is the objc funcname pattern versus the
default one for matlab files". I'd be interested to hear results from
Matlab people.
And of course there's the question of how good or bad each heuristic is.
It sounds like ".c" is more likely to be C than ".m" is to be objc, for
example. So maybe the concept is sound, but "*.m" is too overloaded an
extension to make the default list. I dunno.
-Peff
^ permalink raw reply
* Re: [PATCH] attr: map builtin userdiff drivers to well-known extensions
From: Jeff King @ 2011-12-16 19:30 UTC (permalink / raw)
To: Johannes Sixt; +Cc: Junio C Hamano, git, Brandon Casey
In-Reply-To: <20111216192104.GA19924@sigill.intra.peff.net>
On Fri, Dec 16, 2011 at 02:21:04PM -0500, Jeff King wrote:
> At the same time, I think we have seen people complaining that the
> regular dumb funcname detection is not good enough[1], and that using
> language-specific funcnames, while not 100% perfect, produces better
> results on the whole.
I forgot to include my footnote, but it was:
[1] We've seen requests on the list, and we also receive similar
requests at GitHub for web-based diffs to use better funcnames. We
just enabled the mapping ourselves a week or two ago via a system
/etc/gitattributes file.
^ permalink raw reply
* Re: [PATCH] attr: map builtin userdiff drivers to well-known extensions
From: Jeff King @ 2011-12-16 19:32 UTC (permalink / raw)
To: Philip Oakley; +Cc: Junio C Hamano, git, Brandon Casey
In-Reply-To: <A8E08CC616E248EC8F9000DD86F228E0@PhilipOakley>
On Fri, Dec 16, 2011 at 07:26:00PM -0000, Philip Oakley wrote:
> >+ "*.m diff=objc",
>
> There is a conflict here with the Matlab community which also uses
> "*.m" files for its scripts and code.
> They fit the "It's not that hard to do, but it's an extra step that
> the user might not even know is an option." rationale.
>
> If the objc.m is used as a default it must be overidable easily, and
> listed in the appropriate documentation to mitigate the "might not
> even know" risk.
It is easily overridable; just put your own "*.m" (or anything that
matches your files) entry into your gitattributes file. I'm more
concerned that people will start getting worse results than with the
default, and not know how to fix it.
If you have some Matlab files, would you mind doing diffs with the
default driver and with the objc driver, and comment on how good or bad
the results are?
-Peff
^ permalink raw reply
* Re: [PATCH] attr: map builtin userdiff drivers to well-known extensions
From: Junio C Hamano @ 2011-12-16 19:33 UTC (permalink / raw)
To: Jeff King; +Cc: Johannes Sixt, git, Brandon Casey
In-Reply-To: <20111216192104.GA19924@sigill.intra.peff.net>
Jeff King <peff@peff.net> writes:
> I'm not clear from what you wrote on whether you were saying it is
> simply sub-optimal, or whether on balance it is way worse than the
> default funcname matching.
I think we recently saw that the optional built-in one for C did not even
understand a function that returns a pointer, and nobody complained about
it for a long time, and what was even more funny was that a patch to fix
it got a comment from somebody who wasn't using the optional built-in one
saying "The default works fine; what problem are you fixing?". That is
clearly one example where the default one is still better than the pattern
based one, iow, the pattern based one is way premature to be turned on by
default.
> And if it is bad on balance, is the right solution to avoid exposing
> people to it, or is it to make our patterns better?
Can't we do both, by avoid exposing normal users to broken one while
people who want to improve the pattern based one work on unbreak it?
> I.e., is it fixable,
> or is it simply too hard a problem to get right in the general case, and
> we shouldn't turn it on by default?
I do not think that is the "either-or" question. My impression has been
that even if it is fixable, it is too broken and produces worse result
than the simple default in its current form.
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox