* Re: [PATCH] format-patch: avoid generation of empty patches
From: Junio C Hamano @ 2009-01-10 20:41 UTC (permalink / raw)
To: Nathan W. Panike; +Cc: Alexander Potashev, git
In-Reply-To: <d77df1110901100801s463bb43bt701a95df14f167d8@mail.gmail.com>
"Nathan W. Panike" <nathan.panike@gmail.com> writes:
> On Sat, Jan 10, 2009 at 5:39 AM, Alexander Potashev
> <aspotashev@gmail.com> wrote:
> ...
>>
>> + if (!commit->parents && !rev.show_root_diff)
>> + break;
>
> Do you really want to stop getting commits? It seems like the break
> statement here should be a continue.
You can give a commit range that has two independent roots. The above
"break" is wrong.
The variable is called show_root_DIFF, not show_root_COMMIT; even if you
have "log.showroot = false", "git log -p" output would still give you the
initial commit, but without the patch text, no?
But that is not Alexander's fault; it is mine.
I think "log -p" and "format-patch" can and should behave differently in
this case. "log -p" is for people who already _have_ the history and
would want to inspect how it evolved, and it is reasonable if some people
want to say "the very initial huge import is not interesting to me while
reviewing the history", and turning it off makes sense for them (in fact,
the default was initially that way).
On the other hand, "format-patch" is about exporting a part of your
history so that you can mechanincally replay it elsewhere, and I do not
think of a reasonable justification not to export a root commit fully if
the range user asked for happens to contain one.
I agree with Alexander that we should not output just the message without
the patch text, but I think the right solution is to show both. not to
skip root.
-- >8 --
format-patch: show patch text for the root commit
Even without --root specified, if the range given on the command line
happens to include a root commit, we should include its patch text in the
output.
This fix deliberately ignores log.showroot configuration variable because
"format-patch" and "log -p" can and should behave differently in this
case, as the former is about exporting a part of your history in a form
that is replayable elsewhere and just giving the commit log message
without the patch text does not make any sense for that purpose.
Noticed and fix originally attempted by Nathan W. Panike; credit goes to
Alexander Potashev for injecting sanity to my initial (broken) fix that
used the value from log.showroot configuration, which was misguided.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
builtin-log.c | 7 +++++++
1 files changed, 7 insertions(+), 0 deletions(-)
diff --git c/builtin-log.c w/builtin-log.c
index 4a02ee9..91e5412 100644
--- c/builtin-log.c
+++ w/builtin-log.c
@@ -935,6 +935,13 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
* get_revision() to do the usual traversal.
*/
}
+
+ /*
+ * We cannot move this anywhere earlier because we do want to
+ * know if --root was given explicitly from the comand line.
+ */
+ rev.show_root_diff = 1;
+
if (cover_letter) {
/* remember the range */
int i;
^ permalink raw reply related
* TortoiseGit
From: Victor Westmann @ 2009-01-10 22:46 UTC (permalink / raw)
To: git
Hi everyone.
How can I help translate this nice piece of software to my native
language? (pt_BR = Brazilian portuguese).
Cheers.
--
[]s
Victor Westmann
Joan Crawford - "I, Joan Crawford, I believe in the dollar.
Everything I earn, I spend."
^ permalink raw reply
* Re: [PATCH/RFC v5 1/1] more cache effective symlink/directory detection
From: Junio C Hamano @ 2009-01-11 0:48 UTC (permalink / raw)
To: René Scharfe; +Cc: Kjetil Barvik, git, Pete Harlan, Linus Torvalds
In-Reply-To: <49687440.5090506@lsrfire.ath.cx>
René Scharfe <rene.scharfe@lsrfire.ath.cx> writes:
> Kjetil Barvik schrieb:
>> - Also introduce a 'void clear_lstat_cache(void)' function, which
>> should be used to clean the cache before usage. If for instance,
>> you have changed the types of directories which should be cached,
>> the cache could contain a path which was not wanted.
>
> Is it possible to make the cache detect these situations automatically
> by saving track_flags along with the cache contents? Not having to
> clear the cache manually would be a major feature.
> Also, it's probably worth to split this patch up again. First switching
> to your improved implementation of has_symlink_leading_path(), then
> introducing has_symlink_or_noent_leading_path() and finally adding
> LSTAT_FULLPATH and the fourth parameter of lstat_cache() etc. and using
> this feature in entry.c seems like a nice incremental progression.
Both are reasonable suggestions. Thanks.
^ permalink raw reply
* Re: [PATCH v4] submodule: add --no-fetch parameter to update command
From: Nanako Shiraishi @ 2009-01-11 1:31 UTC (permalink / raw)
To: Fabian Franz; +Cc: git, j.sixt, hjemli, gitster
In-Reply-To: <1231553410-7541-2-git-send-email-git@fabian-franz.de>
Quoting Fabian Franz <git@fabian-franz.de> writes:
> git submodule update --no-fetch makes it possible to use git submodule
> update in complete offline mode by not fetching new revisions.
I may be confused, but does it make any sense to update without fetching?
--
Nanako Shiraishi
http://ivory.ap.teacup.com/nanako3/
^ permalink raw reply
* Re: [PATCH v4] submodule: add --no-fetch parameter to update command
From: Nanako Shiraishi @ 2009-01-11 1:32 UTC (permalink / raw)
To: Fabian Franz; +Cc: git, j.sixt, hjemli, gitster
In-Reply-To: <1231553410-7541-2-git-send-email-git@fabian-franz.de>
Quoting Fabian Franz <git@fabian-franz.de> writes:
> When -u|--use-gitmodules is given the update command uses the .gitmodules
> config file instead of the .git/config file to obtain the track parameter.
> This makes it for example possible to change branches for all submodules
> at the same time without having to sync up .git/config first.
>
> Signed-off-by: Fabian Franz <git@fabian-franz.de>
Once you start doing that, doesn't it force you to always bypass .git/config, making the current mechanism even less useful? If you find that "having to sync up .git/config first" is inconvenient, wouldn't it be more useful for your patch if it made it easier and more convenient the procedure "to sync up .git/config first", instead of bypassing the current setup?
--
Nanako Shiraishi
http://ivory.ap.teacup.com/nanako3/
^ permalink raw reply
* Re: [PATCH v4] submodule: add +<branch> syntax for track to allow automatic pulling.
From: Junio C Hamano @ 2009-01-11 1:34 UTC (permalink / raw)
To: Fabian Franz; +Cc: git, j.sixt, hjemli
In-Reply-To: <1231553410-7541-4-git-send-email-git@fabian-franz.de>
Fabian Franz <git@fabian-franz.de> writes:
> When the track parameter is set to +<branch>, on update command a
> new branch is created tracking the remote branch (if it does not
> yet exist). Then the branch is checked out and git-pull is run.
Usually a "+" in front of a refspec means "allow forcing non-ff update",
but here it means completely different thing. The syntax is confusing.
But that aside, I do not know if the semantics of this patch makes sense
to begin with.
Should this really be a persistent property of a submodule? With your
patch, this is always triggered for modules you did "--track +branch" when
you added them, but (1) not for others you did not say "+" when you added,
and (2) you cannot disable the auto-pull for the ones you did even if you
wanted to. It feels it might be better to make it a property of one
particular invocation of "update" action, and more generally, the entire
series feels like a very specific hack that is meant to cover/impose your
particular workflow (and not others').
Don't get me wrong. I do not see any problem in supporing it well, if
that particular workflow is a good workflow and a generally applicable
one.
But I do not think it is documented well enough. "--track creates one
configuration in the .git/config file" and "update always checks out the
tip of the named branch if such configuration is set" may be fine
descriptions of what each piece does. The readers will be left puzzled
without an explanation of the bigger picture to understand why it is
(sometimes) a good idea to make these pieces do what they do.
^ permalink raw reply
* Re: [PATCH v2] make diff --color-words customizable
From: Junio C Hamano @ 2009-01-11 1:34 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Thomas Rast, git, Teemu Likonen
In-Reply-To: <alpine.DEB.1.00.0901101239550.30769@pacific.mpi-cbg.de>
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> Well, the thing I tried to hint at: it is not good to have a monster
> patch, as nobody will review it.
>
> In your case, I imagine it would be much easier to get reviewers if you
> had
>
> patch 1/4 refactor color-words to allow for 0-character word
> boundaries
> patch 2/4 allow regular expressions to define what makes a word
> patch 3/4 add option to specify word boundary regexps via
> attributes
> patch 4/4 test word boundary regexps
>
> And I admit that I documented the code lousily, but that does not mean
> that you should repeat that mistake.
Sounds like a reasonable request. Also I am seeing:
diff.c: In function 'scan_word_boundaries':
diff.c:512: warning: enumeration value 'DIFF_WORD_UNDEF' not handled in switch
from this part of the code:
for (i = 0; i < len; i++) {
switch (buf->boundaries[i]) {
case DIFF_WORD_BODY:
*p++ = text[i];
break;
case DIFF_WORD_END:
*p++ = text[i];
*p++ = '\n'; /* insert an artificial newline */
break;
case DIFF_WORD_SPACE:
*p++ = '\n';
break;
case DIFF_WORD_SKIP:
/* nothing */
break;
}
}
^ permalink raw reply
* Re: [BUG PATCH RFC] mailinfo: correctly handle multiline 'Subject:' header
From: Junio C Hamano @ 2009-01-11 1:54 UTC (permalink / raw)
To: Kirill Smelkov; +Cc: Alexander Potashev, git
In-Reply-To: <20090108231135.GB4185@roro3>
Kirill Smelkov <kirr@landau.phys.spbu.ru> writes:
> [but I'm not sure whether testresult with Nathaniel Borenstein
> (םולש ןב ילטפנ) is correct -- see rfc2047-info-0004]
> ...
> diff --git a/t/t5100/rfc2047-info-0004 b/t/t5100/rfc2047-info-0004
> new file mode 100644
> index 0000000..850f831
> --- /dev/null
> +++ b/t/t5100/rfc2047-info-0004
> @@ -0,0 +1,5 @@
> +Author: Nathaniel Borenstein
> + (םולש ןב ילטפנ)
> +Email: nsb@thumper.bellcore.com
> +Subject: Test of new header generator
> +
That does look wrong. If you can fix this, please do so; otherwise please
mark the test that deals with this entry with test_expect_failure, until
somebody else does.
^ permalink raw reply
* Re: TortoiseGit
From: Frank Li @ 2009-01-11 1:57 UTC (permalink / raw)
To: Victor Westmann; +Cc: git
In-Reply-To: <4d5bc4bb0901101446j29d147c3pb885824e8d0d484f@mail.gmail.com>
Great!
I have not to enabled multi-language support yet.
I will enable it soon.
When it is ready, I will tell you.
Thank you very much
^ permalink raw reply
* Re: TortoiseGit
From: Victor Westmann @ 2009-01-11 3:06 UTC (permalink / raw)
To: Frank Li; +Cc: git
In-Reply-To: <1976ea660901101757n3f18c0e4m2fa275ab7b961a2c@mail.gmail.com>
Frank,
hi. Youre more than welcome! I have helped in the translation of
TortoiseCVS and TortoiseSVN so I thought it would be nice to help in
this project too! :)
Cheers.
--
[]s
Victor Westmann
Steve Martin - "I like a woman with a head on her shoulders. I hate necks."
^ permalink raw reply
* Re: TortoiseGit
From: Frank Li @ 2009-01-11 3:09 UTC (permalink / raw)
To: Victor Westmann; +Cc: git
In-Reply-To: <4d5bc4bb0901101906g4f64100dyc140ebd62d51fdf1@mail.gmail.com>
Absolutely yes!
Welcome translation!
^ permalink raw reply
* Re: [PATCH/RFC v5 1/1] more cache effective symlink/directory detection
From: Kjetil Barvik @ 2009-01-11 8:26 UTC (permalink / raw)
To: Junio C Hamano; +Cc: René Scharfe, git
In-Reply-To: <7vljtivfao.fsf@gitster.siamese.dyndns.org>
Junio C Hamano <gitster@pobox.com> writes:
> René Scharfe <rene.scharfe@lsrfire.ath.cx> writes:
>
>> Kjetil Barvik schrieb:
>>> - Also introduce a 'void clear_lstat_cache(void)' function, which
>>> should be used to clean the cache before usage. If for instance,
>>> you have changed the types of directories which should be cached,
>>> the cache could contain a path which was not wanted.
>>
>> Is it possible to make the cache detect these situations automatically
>> by saving track_flags along with the cache contents? Not having to
>> clear the cache manually would be a major feature.
>
>> Also, it's probably worth to split this patch up again. First switching
>> to your improved implementation of has_symlink_leading_path(), then
>> introducing has_symlink_or_noent_leading_path() and finally adding
>> LSTAT_FULLPATH and the fourth parameter of lstat_cache() etc. and using
>> this feature in entry.c seems like a nice incremental progression.
>
> Both are reasonable suggestions. Thanks.
Ok! Thanks for comments! Version 6 will follow shortly!
-- kjetil
^ permalink raw reply
* What's cooking in git.git (Jan 2009, #02; Sun, 11)
From: Junio C Hamano @ 2009-01-11 9:51 UTC (permalink / raw)
To: git
Here are the topics that have been cooking. Commits prefixed
with '-' are only in 'pu' while commits prefixed with '+' are
in 'next'. The ones marked with '.' do not appear in any of the branches,
but I am still holding onto them.
The topics list the commits in reverse chronological order. The topics
meant to be merged to the maintenance series have "maint-" in their names.
----------------------------------------------------------------
[New Topics]
* rs/fgrep (Sat Jan 10 00:18:34 2009 +0100) 2 commits
+ grep: don't call regexec() for fixed strings
+ grep -w: forward to next possible position after rejected match
* lt/zlib-wrap-xprm (Wed Jan 7 19:54:47 2009 -0800) 1 commit
- Wrap inflateInit to retry allocation after releasing pack memory
Need to clean up the log message, perhaps rebase it to maint-1.6.0 and
start cooking in 'next'.
* jc/maint-format-patch (Sat Jan 10 12:41:33 2009 -0800) 1 commit
+ format-patch: show patch text for the root commit
* tr/maint-no-index-fixes (Wed Jan 7 12:15:30 2009 +0100) 3 commits
+ diff --no-index -q: fix endless loop
+ diff --no-index: test for pager after option parsing
+ diff: accept -- when using --no-index
* gb/gitweb-opml (Fri Jan 2 13:49:30 2009 +0100) 2 commits
- gitweb: suggest name for OPML view
- gitweb: don't use pathinfo for global actions
* mh/maint-commit-color-status (Thu Jan 8 19:53:05 2009 +0100) 2 commits
- git-status -v: color diff output when color.ui is set
- git-commit: color status output when color.ui is set
* ks/maint-mailinfo-folded (Thu Jan 8 01:43:42 2009 +0300) 1 commit
- mailinfo: correctly handle multiline 'Subject:' header
* js/patience-diff (Thu Jan 1 17:39:37 2009 +0100) 3 commits
- bash completions: Add the --patience option
- Introduce the diff option '--patience'
- Implement the patience diff algorithm
All of the above 'pu' topics are ready for 'next'.
* ap/clone-into-empty (Fri Jan 9 02:24:23 2009 +0300) 2 commits
- Use is_pseudo_dir_name everywhere
- Allow cloning to an existing empty directory
There is an updated patch that only refactors the repeated code to check
if a dirent is dot or dot-dot posted, which I should have picked up to
replace these but I haven't yet (the "clone into empty" can and should
build on top of it).
----------------------------------------------------------------
[Stalled and may need help and prodding to go forward]
* ds/uintmax-config (Mon Nov 3 09:14:28 2008 -0900) 1 commit
- autoconf: Enable threaded delta search when pthreads are supported
This automatically enables threaded delta search code when autoconf
detects pthreads are usable. I haven't heard neither positive nor
negative comments from minority platforms that might be harmed, but
this feels like the right thing to do, so perhaps the best course of
action is to merge this down to 'master' and see if anybody screams.
* jc/blame (Wed Jun 4 22:58:40 2008 -0700) 2 commits
+ blame: show "previous" information in --porcelain/--incremental
format
+ git-blame: refactor code to emit "porcelain format" output
This gives Porcelains (like gitweb) the information on the commit _before_
the one that the final blame is laid on, which should save them one
rev-parse to dig further. The line number in the "previous" information
may need refining, and sanity checking code for reference counting may
need to be resurrected before this can move forward.
----------------------------------------------------------------
[Actively cooking]
* mv/apply-parse-opt (Fri Jan 9 22:21:36 2009 -0800) 2 commits
+ Resurrect "git apply --flags -" to read from the standard input
+ parse-opt: migrate builtin-apply.
* rs/maint-shortlog-foldline (Tue Jan 6 21:41:06 2009 +0100) 1 commit
+ shortlog: handle multi-line subjects like log --pretty=oneline et.
al. do
* tr/rebase-root (Fri Jan 2 23:28:29 2009 +0100) 4 commits
- rebase: update documentation for --root
- rebase -i: learn to rebase root commit
- rebase: learn to rebase root commit
- rebase -i: execute hook only after argument checking
I should be able to find time to read this over again and merge to
'next' sometime this week.
* as/autocorrect-alias (Sun Jan 4 18:16:01 2009 +0100) 1 commit
+ git.c: make autocorrected aliases work
* js/notes (Sat Dec 20 13:06:03 2008 +0100) 4 commits
- Add an expensive test for git-notes
- Speed up git notes lookup
- Add a script to edit/inspect notes
- Introduce commit notes
* sc/gitweb-category (Fri Dec 12 00:45:12 2008 +0100) 3 commits
- gitweb: Optional grouping of projects by category
- gitweb: Split git_project_list_body in two functions
- gitweb: Modularized git_get_project_description to be more generic
* gb/gitweb-patch (Thu Dec 18 08:13:19 2008 +0100) 4 commits
- gitweb: link to patch(es) view in commit(diff) and (short)log view
- gitweb: add patches view
- gitweb: change call pattern for git_commitdiff
- gitweb: add patch view
----------------------------------------------------------------
[Graduated to "master"]
* mh/maint-sendmail-cc-doc (Mon Dec 29 00:37:25 2008 +0100) 1 commit
+ doc/git-send-email: mention sendemail.cc config variable
* rs/diff-ihc (Sun Dec 28 19:45:32 2008 +0100) 1 commit
+ diff: add option to show context between close hunks
* js/maint-merge-recursive-r-d-conflict (Mon Dec 22 23:10:20 2008 +0100) 1 commit
+ merge-recursive: mark rename/delete conflict as unmerged
* mk/gitweb-feature (Mon Dec 15 22:16:19 2008 -0800) 1 commit
+ gitweb: unify boolean feature subroutines
* cb/merge-recursive-fix (Mon Dec 15 02:41:24 2008 -0800) 3 commits
+ Merge branch 'cb/maint-merge-recursive-fix' into cb/merge-
recursive-fix
+ merge-recursive: do not clobber untracked working tree garbage
+ modify/delete conflict resolution overwrites untracked file
* cb/maint-merge-recursive-fix (Sun Dec 14 19:40:09 2008 -0800) 2 commits
+ merge-recursive: do not clobber untracked working tree garbage
+ modify/delete conflict resolution overwrites untracked file
* wp/add-p-goto (Thu Dec 4 10:22:40 2008 +0000) 2 commits
+ Add 'g' command to go to a hunk
+ Add subroutine to display one-line summary of hunks
* jn/gitweb-blame (Thu Dec 11 01:33:29 2008 +0100) 3 commits
+ gitweb: cache $parent_commit info in git_blame()
+ gitweb: A bit of code cleanup in git_blame()
+ gitweb: Move 'lineno' id from link to row element in git_blame
* mv/um-pdf (Wed Dec 10 23:44:50 2008 +0100) 1 commit
+ Add support for a pdf version of the user manual
* kk/maint-http-push (Tue Dec 23 11:31:15 2008 +0300) 1 commit
+ http-push: support full URI in handle_remote_ls_ctx()
----------------------------------------------------------------
[Will merge to "master" soon]
* nd/grep-assume-unchanged (Sat Dec 27 15:21:03 2008 +0700) 2 commits
+ grep: grep cache entries if they are "assume unchanged"
+ grep: support --no-ext-grep to test builtin grep
* as/maint-shortlog-cleanup (Tue Dec 30 22:01:44 2008 +0100) 1 commit
+ builtin-shortlog.c: use string_list_append(), and don't strdup
unnecessarily
* jc/maint-ls-tree (Wed Dec 31 19:00:50 2008 +0900) 2 commits
+ Document git-ls-tree --full-tree
+ ls-tree: add --full-tree option
* js/bundle-tags (Fri Jan 2 19:08:46 2009 +0100) 1 commit
+ bundle: allow rev-list options to exclude annotated tags
* js/add-not-submodule (Fri Jan 2 19:08:40 2009 +0100) 1 commit
+ git add: do not add files from a submodule
* pb/maint-git-pm-false-dir (Mon Dec 29 01:25:00 2008 +0100) 1 commit
+ Git.pm: correctly handle directory name that evaluates to "false"
* pj/maint-ldflags (Sun Jan 4 21:27:41 2009 -0500) 1 commit
+ configure clobbers LDFLAGS
* fe/cvsserver (Fri Jan 2 16:40:14 2009 +0100) 2 commits
+ cvsserver: change generation of CVS author names
+ cvsserver: add option to configure commit message
* js/maint-bisect-gitk (Fri Jan 2 19:08:00 2009 +0100) 1 commit
+ bisect view: call gitk if Cygwin's SESSIONNAME variable is set
* np/no-loosen-prune-expire-now (Tue Dec 30 14:45:11 2008 -0500) 1 commit
+ objects to be pruned immediately don't have to be loosened
* cb/maint-unpack-trees-absense (Thu Jan 1 21:54:33 2009 +0100) 3 commits
+ unpack-trees: remove redundant path search in verify_absent
+ unpack-trees: fix path search bug in verify_absent
+ unpack-trees: handle failure in verify_absent
* mc/cd-p-pwd (Tue Dec 30 07:10:24 2008 -0800) 1 commit
+ git-sh-setup: Fix scripts whose PWD is a symlink to a work-dir on
OS X
* mh/cherry-default (Thu Jan 1 22:56:29 2009 +0100) 2 commits
+ Documentation: clarify which parameters are optional to git-cherry
+ git-cherry: make <upstream> parameter optional
----------------------------------------------------------------
[Will drop]
* as/commit-signoff (Mon Dec 29 12:16:45 2008 +0100) 1 commit
- [WIP] Add a commit.signoff configuration option to always use --
signoff in commit
The semantics when "git commit" was used as a backend for other actions
such as rebase and cherry-pick was unclear.
----------------------------------------------------------------
[On Hold]
* jk/renamelimit (Sat May 3 13:58:42 2008 -0700) 1 commit
- diff: enable "too large a rename" warning when -M/-C is explicitly
asked for
This would be the right thing to do for command line use,
but gitk will be hit due to tcl/tk's limitation, so I am holding
this back for now.
* jc/stripspace (Sun Mar 9 00:30:35 2008 -0800) 6 commits
- git-am --forge: add Signed-off-by: line for the author
- git-am: clean-up Signed-off-by: lines
- stripspace: add --log-clean option to clean up signed-off-by:
lines
- stripspace: use parse_options()
- Add "git am -s" test
- git-am: refactor code to add signed-off-by line for the committer
^ permalink raw reply
* Re: [PATCH 3/6] Glean libexec path from argv[0] for git-upload-pack and git-receive-pack.
From: Steffen Prohaska @ 2009-01-11 10:04 UTC (permalink / raw)
To: Junio C Hamano
Cc: Johannes Schindelin, Git Mailing List, Johannes Sixt,
Steve Haslam
In-Reply-To: <7viqomx5iq.fsf@gitster.siamese.dyndns.org>
On Jan 10, 2009, at 9:36 PM, Junio C Hamano wrote:
> Steffen Prohaska <prohaska@zib.de> writes:
>
>> On Jan 10, 2009, at 3:34 PM, Johannes Schindelin wrote:
>>
>>> Logically, and to avoid committing a broken revision, 1/6 should
>>> come
>>> last, methinks.
>>
>> RUNTIME_PREFIX is not defined before 6/6. But I agree,
>> 1/6 should probably be moved after 5/6.
>>
> Hmm, I actually was thinking about applying that (and that one only)
> early
> to my tree, to make sure it is regression-free.
Does "early" mean that you want to wait and see how the discussion
about the other patches evolves before you consider applying them,
or does "and that one only" mean that you tend to not apply the
other patches at all?
Steffen
^ permalink raw reply
* Re: [PATCH 3/6] Glean libexec path from argv[0] for git-upload-pack and git-receive-pack.
From: Junio C Hamano @ 2009-01-11 10:21 UTC (permalink / raw)
To: Steffen Prohaska
Cc: Johannes Schindelin, Git Mailing List, Johannes Sixt,
Steve Haslam
In-Reply-To: <E976B246-AD14-4B03-B204-F6A1014071DF@zib.de>
Steffen Prohaska <prohaska@zib.de> writes:
>> Hmm, I actually was thinking about applying that (and that one only)
>> early
>> to my tree, to make sure it is regression-free.
>
> Does "early" mean that you want to wait and see how the discussion
> about the other patches evolves before you consider applying them,
> or does "and that one only" mean that you tend to not apply the
> other patches at all?
My initial impression after reading 1/6 was that no matter how the actual
runtime prefix detection logic that is implemented in the later parts of
the series for particular platform will turn out to be, the update to the
Makefile that is done by 1/6 won't have to change. If I apply 1/6 first
without applying anything else, we can make sure that it would not regress
for Unix people (and catch regressions early if any), while Windows people
polish the platform specific parts of the implementation in the later
parts of the series that can be replaced.
Because changes to Makefile variables tend to have unexpected side effects
(people have their own definition to override them in their build
procedures and you can easily break them unless you are careful), I wanted
to make sure the common part is solid before waiting for the other part.
But if you think it is better not to apply any one, until other parts
mature, it is Ok by me, too.
^ permalink raw reply
* [PATCH v3 0/4] customizable --color-words
From: Thomas Rast @ 2009-01-11 10:27 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Johannes Schindelin, Teemu Likonen
In-Reply-To: <7vr63atykr.fsf@gitster.siamese.dyndns.org>
Johannes Schindelin wrote:
> On Sat, 10 Jan 2009, Thomas Rast wrote:
> > Johannes Schindelin wrote:
> > > So I still find your patch way too large
The bad news is... it just got bigger ;-)
> In your case, I imagine it would be much easier to get reviewers if you
> had
>
> patch 1/4 refactor color-words to allow for 0-character word
> boundaries
> patch 2/4 allow regular expressions to define what makes a word
So here's a 4-patch series. I put the first split in a different
place than you suggested, however. I couldn't see a good way to
separate empty boundaries from regex splitting in such a way that the
first half can be exercised (is not just dead code). 1/4 basically
just rearranges code a bit and should be a real no-op patch.
Junio C Hamano wrote:
> diff.c: In function 'scan_word_boundaries':
> diff.c:512: warning: enumeration value 'DIFF_WORD_UNDEF' not handled in sw
Thanks, added a case to test for this.
There is one other minor semantic change in 2/4: the error reporting
in case your regex matched "foo\nbar" now says "before 'bar'" instead
of "near '\nbar'". Other than that, there are only a bunch of added
comments when comparing the result of all four patches with v2.
Thomas Rast (4):
word diff: comments, preparations for regex customization
word diff: customizable word splits
word diff: make regex configurable via attributes
word diff: test customizable word splits
Documentation/diff-options.txt | 18 +++-
Documentation/gitattributes.txt | 21 +++
diff.c | 282 ++++++++++++++++++++++++++++++++++++---
diff.h | 1 +
t/t4033-diff-color-words.sh | 90 +++++++++++++
userdiff.c | 27 +++-
userdiff.h | 1 +
7 files changed, 413 insertions(+), 27 deletions(-)
create mode 100755 t/t4033-diff-color-words.sh
^ permalink raw reply
* [PATCH v3 1/4] word diff: comments, preparations for regex customization
From: Thomas Rast @ 2009-01-11 10:27 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Johannes Schindelin, Teemu Likonen
In-Reply-To: <7vr63atykr.fsf@gitster.siamese.dyndns.org>
This reorganizes the code for diff --color-words in a way that will be
convenient for the next patch, without changing any of the semantics.
The new variables are not used yet except for their default state.
We also add some comments on the workings of diff_words_show() and
associated helper routines.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
---
diff.c | 86 +++++++++++++++++++++++++++++++++++++++++++++++++++------------
1 files changed, 69 insertions(+), 17 deletions(-)
diff --git a/diff.c b/diff.c
index d235482..f274bf5 100644
--- a/diff.c
+++ b/diff.c
@@ -316,11 +316,21 @@ static int fill_mmfile(mmfile_t *mf, struct diff_filespec *one)
return 0;
}
+/* unused right now */
+enum diff_word_boundaries {
+ DIFF_WORD_UNDEF,
+ DIFF_WORD_BODY,
+ DIFF_WORD_END,
+ DIFF_WORD_SPACE,
+ DIFF_WORD_SKIP
+};
+
struct diff_words_buffer {
mmfile_t text;
long alloc;
long current; /* output pointer */
int suppressed_newline;
+ enum diff_word_boundaries *boundaries;
};
static void diff_words_append(char *line, unsigned long len,
@@ -339,16 +349,23 @@ static void diff_words_append(char *line, unsigned long len,
struct diff_words_data {
struct diff_words_buffer minus, plus;
FILE *file;
+ regex_t *word_regex; /* currently unused */
};
-static void print_word(FILE *file, struct diff_words_buffer *buffer, int len, int color,
+/*
+ * Print 'len' characters from the "real" diff data in 'buffer'. Also
+ * returns how many characters were printed (currently always 'len').
+ * With 'suppress_newline', we remember a final newline instead of
+ * printing it.
+ */
+static int print_word(FILE *file, struct diff_words_buffer *buffer, int len, int color,
int suppress_newline)
{
const char *ptr;
int eol = 0;
if (len == 0)
- return;
+ return len;
ptr = buffer->text.ptr + buffer->current;
buffer->current += len;
@@ -368,18 +385,30 @@ static void print_word(FILE *file, struct diff_words_buffer *buffer, int len, in
else
putc('\n', file);
}
+
+ /* we need to return how many chars to skip on the other side,
+ * so account for the (held off) \n */
+ return len+eol;
}
+/*
+ * Callback for word diff output
+ */
static void fn_out_diff_words_aux(void *priv, char *line, unsigned long len)
{
struct diff_words_data *diff_words = priv;
if (diff_words->minus.suppressed_newline) {
+ /* We completely drop a suppressed newline on the
+ * minus side, if it is immediately followed by a plus
+ * side output. This formats a word change right
+ * before the end of line correctly */
if (line[0] != '+')
putc('\n', diff_words->file);
diff_words->minus.suppressed_newline = 0;
}
+ /* account for the [+- ] inserted by the line diff */
len--;
switch (line[0]) {
case '-':
@@ -391,8 +420,10 @@ static void fn_out_diff_words_aux(void *priv, char *line, unsigned long len)
&diff_words->plus, len, DIFF_FILE_NEW, 0);
break;
case ' ':
- print_word(diff_words->file,
- &diff_words->plus, len, DIFF_PLAIN, 0);
+ len = print_word(diff_words->file,
+ &diff_words->plus, len, DIFF_PLAIN, 0);
+ /* skip the characters that were printed on
+ * the other side, too */
diff_words->minus.current += len;
break;
}
@@ -409,22 +440,37 @@ static void diff_words_show(struct diff_words_data *diff_words)
memset(&xpp, 0, sizeof(xpp));
memset(&xecfg, 0, sizeof(xecfg));
- minus.size = diff_words->minus.text.size;
- minus.ptr = xmalloc(minus.size);
- memcpy(minus.ptr, diff_words->minus.text.ptr, minus.size);
- for (i = 0; i < minus.size; i++)
- if (isspace(minus.ptr[i]))
- minus.ptr[i] = '\n';
- diff_words->minus.current = 0;
- plus.size = diff_words->plus.text.size;
- plus.ptr = xmalloc(plus.size);
- memcpy(plus.ptr, diff_words->plus.text.ptr, plus.size);
- for (i = 0; i < plus.size; i++)
- if (isspace(plus.ptr[i]))
- plus.ptr[i] = '\n';
+ /* currently always true */
+ if (!diff_words->word_regex) {
+ /*
+ * "Simple" word diff: replace all space characters
+ * with a newline.
+ *
+ * This groups together "words" of nonspaces on a line
+ * each, which we then diff using the normal line-diff
+ * mechanism. It also has the nice property that
+ * character counts/offsets stay the same.
+ */
+ minus.size = diff_words->minus.text.size;
+ minus.ptr = xmalloc(minus.size);
+ memcpy(minus.ptr, diff_words->minus.text.ptr, minus.size);
+ for (i = 0; i < minus.size; i++)
+ if (isspace(minus.ptr[i]))
+ minus.ptr[i] = '\n';
+
+ plus.size = diff_words->plus.text.size;
+ plus.ptr = xmalloc(plus.size);
+ memcpy(plus.ptr, diff_words->plus.text.ptr, plus.size);
+ for (i = 0; i < plus.size; i++)
+ if (isspace(plus.ptr[i]))
+ plus.ptr[i] = '\n';
+ }
+ diff_words->minus.current = 0;
diff_words->plus.current = 0;
+ /* we want a minimal diff with enough context to run
+ * everything into a single hunk */
xpp.flags = XDF_NEED_MINIMAL;
xecfg.ctxlen = diff_words->minus.alloc + diff_words->plus.alloc;
xdi_diff_outf(&minus, &plus, fn_out_diff_words_aux, diff_words,
@@ -432,7 +478,12 @@ static void diff_words_show(struct diff_words_data *diff_words)
free(minus.ptr);
free(plus.ptr);
diff_words->minus.text.size = diff_words->plus.text.size = 0;
+ /* these two are currently free(NULL) */
+ free(diff_words->minus.boundaries);
+ free(diff_words->plus.boundaries);
+ /* do not forget about a possible final newline that was held
+ * back */
if (diff_words->minus.suppressed_newline) {
putc('\n', diff_words->file);
diff_words->minus.suppressed_newline = 0;
@@ -461,6 +512,7 @@ static void free_diff_words_data(struct emit_callback *ecbdata)
free (ecbdata->diff_words->minus.text.ptr);
free (ecbdata->diff_words->plus.text.ptr);
+ free(ecbdata->diff_words->word_regex);
free(ecbdata->diff_words);
ecbdata->diff_words = NULL;
}
--
1.6.1.269.g0769
^ permalink raw reply related
* [PATCH v3 2/4] word diff: customizable word splits
From: Thomas Rast @ 2009-01-11 10:27 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Johannes Schindelin, Teemu Likonen
In-Reply-To: <7vr63atykr.fsf@gitster.siamese.dyndns.org>
Allows for user-configurable word splits when using --color-words.
This can make the diff more readable if the regex is configured
according to the language of the file.
Each non-overlapping match of the regex is a word; everything in
between is whitespace. We disallow matching the empty string (because
it results in an endless loop) or a newline (breaks color escapes and
interacts badly with the input coming from the usual line diff). To
help the user, we set REG_NEWLINE so that [^...] and . do not match
newlines.
--color-words works (and always worked) by splitting words onto one
line each, and using the normal line-diff machinery to get a word
diff. Since we cannot reuse the current approach of simply
overwriting uninteresting characters with '\n', we insert an
artificial '\n' at the end of each detected word. Its presence must
be tracked so that we can distinguish artificial from source newlines.
Insertion of spaces is somewhat subtle. We echo a "context" space
twice (once on each side of the diff) if it follows directly after a
word. While this loses a tiny bit of accuracy, it runs together long
sequences of changed word into one removed and one added block, making
the diff much more readable. This feature also means that the
splitting regex '\S+' results in the same output as the original code.
The existing code still stays in place in case no regex is provided,
for performance.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
---
Documentation/diff-options.txt | 16 +++-
diff.c | 196 +++++++++++++++++++++++++++++++++++++++-
diff.h | 1 +
3 files changed, 206 insertions(+), 7 deletions(-)
diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 671f533..6152d5b 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -91,8 +91,20 @@ endif::git-format-patch[]
Turn off colored diff, even when the configuration file
gives the default to color output.
---color-words::
- Show colored word diff, i.e. color words which have changed.
+--color-words[=<regex>]::
+ Show colored word diff, i.e., color words which have changed.
+ By default, a new word only starts at whitespace, so that a
+ 'word' is defined as a maximal sequence of non-whitespace
+ characters. The optional argument <regex> can be used to
+ configure this.
++
+The <regex> must be an (extended) regular expression. When set, every
+non-overlapping match of the <regex> is considered a word. (Regular
+expression semantics ensure that quantifiers grab a maximal sequence
+of characters.) Anything between these matches is considered
+whitespace and ignored for the purposes of finding differences. You
+may want to append `|\S` to your regular expression to make sure that
+it matches all non-whitespace characters.
--no-renames::
Turn off rename detection, even when the configuration
diff --git a/diff.c b/diff.c
index f274bf5..badaea6 100644
--- a/diff.c
+++ b/diff.c
@@ -316,7 +316,21 @@ static int fill_mmfile(mmfile_t *mf, struct diff_filespec *one)
return 0;
}
-/* unused right now */
+/*
+ * We use these to save the word boundaries:
+ *
+ * - BODY and END track the extent of a word; after END, an artificial
+ * newline will be inserted.
+ *
+ * - SPACE means the character is a space, and will be replaced by a
+ * newline. If a space follows immediately after END, it is flagged
+ * SKIP instead, so that two words with exactly one space in between
+ * end up with only one newline between them.
+ *
+ * - The boundary array is overallocated by one, and the spare element
+ * is flagged UNDEF to allow peeking over the end of a word to see
+ * if the next element is a SKIP.
+ */
enum diff_word_boundaries {
DIFF_WORD_UNDEF,
DIFF_WORD_BODY,
@@ -349,12 +363,12 @@ static void diff_words_append(char *line, unsigned long len,
struct diff_words_data {
struct diff_words_buffer minus, plus;
FILE *file;
- regex_t *word_regex; /* currently unused */
+ regex_t *word_regex;
};
/*
* Print 'len' characters from the "real" diff data in 'buffer'. Also
- * returns how many characters were printed (currently always 'len').
+ * returns how many characters were printed.
* With 'suppress_newline', we remember a final newline instead of
* printing it.
*/
@@ -368,8 +382,27 @@ static int print_word(FILE *file, struct diff_words_buffer *buffer, int len, int
return len;
ptr = buffer->text.ptr + buffer->current;
+
+ if (buffer->boundaries
+ && (buffer->boundaries[buffer->current] == DIFF_WORD_BODY
+ || buffer->boundaries[buffer->current] == DIFF_WORD_END)) {
+ /* drop the artificial newline */
+ len--;
+ /* we still have len>0 because it is a word, and
+ * scan_word_boundaries() disallows words of length 0. */
+ }
+
buffer->current += len;
+ /* Peek past the end (this is safe because we overallocated)
+ * to check if the next character was a skipped space. If so,
+ * we put it together with the word. */
+ if (buffer->boundaries
+ && buffer->boundaries[buffer->current] == DIFF_WORD_SKIP) {
+ buffer->current++;
+ len++;
+ }
+
if (ptr[len - 1] == '\n') {
eol = 1;
len--;
@@ -429,6 +462,141 @@ static void fn_out_diff_words_aux(void *priv, char *line, unsigned long len)
}
}
+/*
+ * Fancy word splitting by regex
+ *
+ * We search for all non-overlapping matches of 'pattern' in the
+ * 'buf', and define every match as a (separate) word and all
+ * unmatched characters as whitespace.
+ *
+ * Then we build a new mmfile where each word is on a line of its own.
+ * Two of these can then be line-diffed to find the word differences
+ * between the original buffers. Unlike the normal word diff (see
+ * diff_words_show() below), the transformation does not preserve
+ * character counts, so we need to keep tracking information in
+ * buf->boundaries for later use by print_word().
+ */
+static void scan_word_boundaries(regex_t *pattern, struct diff_words_buffer *buf,
+ mmfile_t *mmfile)
+{
+ char *text = buf->text.ptr;
+ int len = buf->text.size;
+ int i = 0;
+ /* counts how many extra characters will be inserted into the
+ * mmfile */
+ int count = 0;
+ int ret;
+ regmatch_t matches[1];
+ int offset, wordlen;
+ char *strz, *p;
+
+ /* overallocate by 1 so we can safely peek past the end for a
+ * SKIP, see print_word() */
+ buf->boundaries = xmalloc((len+1) * sizeof(enum diff_word_boundaries));
+ buf->boundaries[len] = DIFF_WORD_UNDEF;
+
+ if (!text) {
+ mmfile->ptr = NULL;
+ mmfile->size = 0;
+ return;
+ }
+
+ /* we unfortunately need a null-terminated copy for regexec */
+ strz = xmalloc(len+1);
+ memcpy(strz, text, len);
+ strz[len] = '\0';
+
+ while (i < len) {
+ /* iteratively match the regex against the rest of the
+ * input string to find the next word */
+ ret = regexec(pattern, strz+i, 1, matches, 0);
+ if (ret == REG_NOMATCH) {
+ /* The rest is whitespace. The first space
+ * character is flagged SKIP unless there was
+ * no preceding text at all */
+ if (i > 0 && i < len) {
+ buf->boundaries[i++] = DIFF_WORD_SKIP;
+ count--; /* SKIP characters have no output */
+ }
+ while (i < len)
+ buf->boundaries[i++] = DIFF_WORD_SPACE;
+ break;
+ }
+
+ offset = matches[0].rm_so;
+
+ /* everything up to the next word is whitespace, using
+ * the same SKIP rule as above */
+ if (offset > 0 && i > 0) {
+ buf->boundaries[i++] = DIFF_WORD_SKIP;
+ count--;
+ offset--;
+ }
+ while (offset-- > 0)
+ buf->boundaries[i++] = DIFF_WORD_SPACE;
+
+ /* rm_eo is the first character after the match, so
+ * this is indeed the number of characters matched */
+ wordlen = matches[0].rm_eo - matches[0].rm_so;
+
+ /* all but the last character are BODY */
+ while (wordlen > 1) {
+ if (strz[i] == '\n')
+ die("word regex matched a newline before '%s'",
+ strz+i+1);
+ buf->boundaries[i++] = DIFF_WORD_BODY;
+ wordlen--;
+ }
+ /* the last character is END, we will insert an extra
+ * '\n' after it */
+ if (wordlen > 0) {
+ if (strz[i] == '\n')
+ die("word regex matched a newline before '%s'",
+ strz+i+1);
+ buf->boundaries[i++] = DIFF_WORD_END;
+ count++;
+ } else {
+ /* this would cause an endless loop, so panic */
+ die("word regex matched the empty string at '%s'",
+ strz+i);
+ }
+ }
+
+ free(strz);
+
+ /* now build the mmfile. there will be 'count' more characters
+ * than in the original */
+ mmfile->size = len + count;
+ mmfile->ptr = xmalloc(mmfile->size);
+ p = mmfile->ptr;
+ for (i = 0; i < len; i++) {
+ switch (buf->boundaries[i]) {
+ case DIFF_WORD_BODY: /* copy over */
+ *p++ = text[i];
+ break;
+ case DIFF_WORD_END: /* copy and insert an artificial newline */
+ *p++ = text[i];
+ *p++ = '\n';
+ break;
+ case DIFF_WORD_SPACE: /* replace by '\n' */
+ *p++ = '\n';
+ break;
+ case DIFF_WORD_SKIP:
+ /* Ignore. Since the character right before
+ * this one is always an END, another way to
+ * look at it is that we avoid duplicate END
+ * newlines that are already provided by
+ * SPACE */
+ break;
+ case DIFF_WORD_UNDEF:
+ /* can't happen, but silences a warning */
+ die("can't happen, send test case to git@vger.kernel.org");
+ break;
+ }
+ }
+}
+
+
/* this executes the word diff on the accumulated buffers */
static void diff_words_show(struct diff_words_data *diff_words)
{
@@ -441,7 +609,6 @@ static void diff_words_show(struct diff_words_data *diff_words)
memset(&xpp, 0, sizeof(xpp));
memset(&xecfg, 0, sizeof(xecfg));
- /* currently always true */
if (!diff_words->word_regex) {
/*
* "Simple" word diff: replace all space characters
@@ -465,6 +632,13 @@ static void diff_words_show(struct diff_words_data *diff_words)
for (i = 0; i < plus.size; i++)
if (isspace(plus.ptr[i]))
plus.ptr[i] = '\n';
+ } else {
+ /* Configurable word diff with a regex. See
+ * scan_word_boundaries() above. */
+ scan_word_boundaries(diff_words->word_regex,
+ &diff_words->minus, &minus);
+ scan_word_boundaries(diff_words->word_regex,
+ &diff_words->plus, &plus);
}
diff_words->minus.current = 0;
diff_words->plus.current = 0;
@@ -478,7 +652,6 @@ static void diff_words_show(struct diff_words_data *diff_words)
free(minus.ptr);
free(plus.ptr);
diff_words->minus.text.size = diff_words->plus.text.size = 0;
- /* these two are currently free(NULL) */
free(diff_words->minus.boundaries);
free(diff_words->plus.boundaries);
@@ -1535,6 +1708,15 @@ static void builtin_diff(const char *name_a,
ecbdata.diff_words =
xcalloc(1, sizeof(struct diff_words_data));
ecbdata.diff_words->file = o->file;
+ if (o->word_regex) {
+ ecbdata.diff_words->word_regex = (regex_t *)
+ xmalloc(sizeof(regex_t));
+ if (regcomp(ecbdata.diff_words->word_regex,
+ o->word_regex,
+ REG_EXTENDED|REG_NEWLINE))
+ die ("Invalid regular expression: %s",
+ o->word_regex);
+ }
}
xdi_diff_outf(&mf1, &mf2, fn_out_consume, &ecbdata,
&xpp, &xecfg, &ecb);
@@ -2546,6 +2728,10 @@ int diff_opt_parse(struct diff_options *options, const char **av, int ac)
DIFF_OPT_CLR(options, COLOR_DIFF);
else if (!strcmp(arg, "--color-words"))
options->flags |= DIFF_OPT_COLOR_DIFF | DIFF_OPT_COLOR_DIFF_WORDS;
+ else if (!prefixcmp(arg, "--color-words=")) {
+ options->flags |= DIFF_OPT_COLOR_DIFF | DIFF_OPT_COLOR_DIFF_WORDS;
+ options->word_regex = arg + 14;
+ }
else if (!strcmp(arg, "--exit-code"))
DIFF_OPT_SET(options, EXIT_WITH_STATUS);
else if (!strcmp(arg, "--quiet"))
diff --git a/diff.h b/diff.h
index 4d5a327..23cd90c 100644
--- a/diff.h
+++ b/diff.h
@@ -98,6 +98,7 @@ struct diff_options {
int stat_width;
int stat_name_width;
+ const char *word_regex;
/* this is set by diffcore for DIFF_FORMAT_PATCH */
int found_changes;
--
1.6.1.269.g0769
^ permalink raw reply related
* [PATCH v3 3/4] word diff: make regex configurable via attributes
From: Thomas Rast @ 2009-01-11 10:27 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Johannes Schindelin, Teemu Likonen
In-Reply-To: <7vr63atykr.fsf@gitster.siamese.dyndns.org>
Make the --color-words splitting regular expression configurable via
the diff driver's 'wordregex' attribute. The user can then set the
driver on a file in .gitattributes. If a regex is given on the
command line, it overrides the driver's setting.
We also provide built-in regexes for some of the languages that
already had funcname patterns. (They are designed to run UTF-8
sequences into a single chunk to make sure they remain readable.)
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
---
Documentation/diff-options.txt | 4 +++-
Documentation/gitattributes.txt | 21 +++++++++++++++++++++
diff.c | 10 ++++++++++
userdiff.c | 27 +++++++++++++++++++--------
userdiff.h | 1 +
5 files changed, 54 insertions(+), 9 deletions(-)
diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 6152d5b..d22c06b 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -96,7 +96,9 @@ endif::git-format-patch[]
By default, a new word only starts at whitespace, so that a
'word' is defined as a maximal sequence of non-whitespace
characters. The optional argument <regex> can be used to
- configure this.
+ configure this. It can also be set via a diff driver, see
+ linkgit:gitattributes[1]; if a <regex> is given explicitly, it
+ overrides any diff driver setting.
+
The <regex> must be an (extended) regular expression. When set, every
non-overlapping match of the <regex> is considered a word. (Regular
diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 8af22ec..67f5522 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -334,6 +334,27 @@ patterns are available:
- `tex` suitable for source code for LaTeX documents.
+Customizing word diff
+^^^^^^^^^^^^^^^^^^^^^
+
+You can customize the rules that `git diff --color-words` uses to
+split words in a line, by specifying an appropriate regular expression
+in the "diff.*.wordregex" configuration variable. For example, in TeX
+a backslash followed by a sequence of letters forms a command, but
+several such commands can be run together without intervening
+whitespace. To separate them, use a regular expression such as
+
+------------------------
+[diff "tex"]
+ wordregex = "\\\\[a-zA-Z]+|[{}]|\\\\.|[^\\{} \t]+"
+------------------------
+
+Similar to 'xfuncname', a built in value is provided for the drivers
+`bibtex`, `html`, `java`, `php`, `python` and `tex`. See the
+documentation of --color-words in linkgit:git-diff[1] for the precise
+semantics.
+
+
Performing text diffs of binary files
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
diff --git a/diff.c b/diff.c
index badaea6..c1cc426 100644
--- a/diff.c
+++ b/diff.c
@@ -1548,6 +1548,12 @@ static const struct userdiff_funcname *diff_funcname_pattern(struct diff_filespe
return one->driver->funcname.pattern ? &one->driver->funcname : NULL;
}
+static const char *userdiff_word_regex(struct diff_filespec *one)
+{
+ diff_filespec_load_driver(one);
+ return one->driver->word_regex;
+}
+
void diff_set_mnemonic_prefix(struct diff_options *options, const char *a, const char *b)
{
if (!options->a_prefix)
@@ -1708,6 +1714,10 @@ static void builtin_diff(const char *name_a,
ecbdata.diff_words =
xcalloc(1, sizeof(struct diff_words_data));
ecbdata.diff_words->file = o->file;
+ if (!o->word_regex)
+ o->word_regex = userdiff_word_regex(one);
+ if (!o->word_regex)
+ o->word_regex = userdiff_word_regex(two);
if (o->word_regex) {
ecbdata.diff_words->word_regex = (regex_t *)
xmalloc(sizeof(regex_t));
diff --git a/userdiff.c b/userdiff.c
index 3681062..7fd9a07 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -6,13 +6,17 @@ static struct userdiff_driver *drivers;
static int ndrivers;
static int drivers_alloc;
-#define FUNCNAME(name, pattern) \
+#define FUNCNAME(name, pattern) \
{ name, NULL, -1, { pattern, REG_EXTENDED } }
+#define PATTERNS(name, pattern, wordregex) \
+ { name, NULL, -1, { pattern, REG_EXTENDED }, NULL, wordregex }
static struct userdiff_driver builtin_drivers[] = {
-FUNCNAME("html", "^[ \t]*(<[Hh][1-6][ \t].*>.*)$"),
-FUNCNAME("java",
+PATTERNS("html", "^[ \t]*(<[Hh][1-6][ \t].*>.*)$",
+ "[^<>= \t]+|\\S"),
+PATTERNS("java",
"!^[ \t]*(catch|do|for|if|instanceof|new|return|switch|throw|while)\n"
- "^[ \t]*(([ \t]*[A-Za-z_][A-Za-z_0-9]*){2,}[ \t]*\\([^;]*)$"),
+ "^[ \t]*(([ \t]*[A-Za-z_][A-Za-z_0-9]*){2,}[ \t]*\\([^;]*)$",
+ "[a-zA-Z_][a-zA-Z0-9_]*|[-+0-9.e]+|[-+*/]=|\\+\\+|--|\\S|[\x80-\xff]+"),
FUNCNAME("objc",
/* Negate C statements that can look like functions */
"!^[ \t]*(do|for|if|else|return|switch|while)\n"
@@ -27,14 +31,19 @@ FUNCNAME("pascal",
"implementation|initialization|finalization)[ \t]*.*)$"
"\n"
"^(.*=[ \t]*(class|record).*)$"),
-FUNCNAME("php", "^[\t ]*((function|class).*)"),
-FUNCNAME("python", "^[ \t]*((class|def)[ \t].*)$"),
+PATTERNS("php", "^[\t ]*((function|class).*)",
+ "\\$?[a-zA-Z_][a-zA-Z0-9_]*|[-+0-9.e]+|[-+*/]=|\\+\\+|--|->|\\S|[\x80-\xff]+"),
+PATTERNS("python", "^[ \t]*((class|def)[ \t].*)$",
+ "[a-zA-Z_][a-zA-Z0-9_]*|[-+0-9.e]+|[-+*/]=|//|\\S|[\x80-\xff]+"),
FUNCNAME("ruby", "^[ \t]*((class|module|def)[ \t].*)$"),
-FUNCNAME("bibtex", "(@[a-zA-Z]{1,}[ \t]*\\{{0,1}[ \t]*[^ \t\"@',\\#}{~%]*).*$"),
-FUNCNAME("tex", "^(\\\\((sub)*section|chapter|part)\\*{0,1}\\{.*)$"),
+PATTERNS("bibtex", "(@[a-zA-Z]{1,}[ \t]*\\{{0,1}[ \t]*[^ \t\"@',\\#}{~%]*).*$",
+ "[={}\"]|[^={}\" \t]+"),
+PATTERNS("tex", "^(\\\\((sub)*section|chapter|part)\\*{0,1}\\{.*)$",
+ "\\\\[a-zA-Z@]+|[{}]|\\\\.|[^\\{} \t]+"),
{ "default", NULL, -1, { NULL, 0 } },
};
#undef FUNCNAME
+#undef PATTERNS
static struct userdiff_driver driver_true = {
"diff=true",
@@ -134,6 +143,8 @@ int userdiff_config(const char *k, const char *v)
return parse_string(&drv->external, k, v);
if ((drv = parse_driver(k, v, "textconv")))
return parse_string(&drv->textconv, k, v);
+ if ((drv = parse_driver(k, v, "wordregex")))
+ return parse_string(&drv->word_regex, k, v);
return 0;
}
diff --git a/userdiff.h b/userdiff.h
index ba29457..2aab13e 100644
--- a/userdiff.h
+++ b/userdiff.h
@@ -12,6 +12,7 @@ struct userdiff_driver {
int binary;
struct userdiff_funcname funcname;
const char *textconv;
+ const char *word_regex;
};
int userdiff_config(const char *k, const char *v);
--
1.6.1.269.g0769
^ permalink raw reply related
* [PATCH v3 4/4] word diff: test customizable word splits
From: Thomas Rast @ 2009-01-11 10:27 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Johannes Schindelin, Teemu Likonen
In-Reply-To: <7vr63atykr.fsf@gitster.siamese.dyndns.org>
Several tests for regex-configured word splits via command line and
gitattributes. For good measure we also do a basic test of the
default --color-words since it was so far not covered at all.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
---
t/t4033-diff-color-words.sh | 90 +++++++++++++++++++++++++++++++++++++++++++
1 files changed, 90 insertions(+), 0 deletions(-)
create mode 100755 t/t4033-diff-color-words.sh
diff --git a/t/t4033-diff-color-words.sh b/t/t4033-diff-color-words.sh
new file mode 100755
index 0000000..536cdac
--- /dev/null
+++ b/t/t4033-diff-color-words.sh
@@ -0,0 +1,90 @@
+#!/bin/sh
+
+
+test_description='diff --color-words'
+. ./test-lib.sh
+
+cat <<EOF > test_a
+foo_bar_baz
+a qu_ux b c
+alpha beta gamma delta
+EOF
+
+cat <<EOF > test_b
+foo_baz_baz
+a qu_new_ux b c
+alpha 4 2 delta
+EOF
+
+# t4026-diff-color.sh tests the color escapes, so we assume they do
+# not change
+
+munge () {
+ tail -n +5 | tr '\033' '!'
+}
+
+cat <<EOF > expect-plain
+![36m@@ -1,3 +1,3 @@![m
+![31mfoo_bar_baz![m![32mfoo_baz_baz![m
+a ![m![31mqu_ux ![m![32mqu_new_ux ![mb ![mc![m
+alpha ![m![31mbeta ![m![31mgamma ![m![32m4 ![m![32m2 ![mdelta![m
+EOF
+
+test_expect_success 'default settings' '
+ git diff --no-index --color-words test_a test_b |
+ munge > actual-plain &&
+ test_cmp expect-plain actual-plain
+'
+
+test_expect_success 'trivial regex yields same as default' '
+ git diff --no-index --color-words="\\S+" test_a test_b |
+ munge > actual-trivial &&
+ test_cmp expect-plain actual-trivial
+'
+
+cat <<EOF > expect-chars
+![36m@@ -1,3 +1,3 @@![m
+f![mo![mo![m_![mb![ma![m![31mr![m![32mz![m_![mb![ma![mz![m
+a ![mq![mu![m_![m![32mn![m![32me![m![32mw![m![32m_![mu![mx ![mb ![mc![m
+a![ml![mp![mh![ma ![m![31mb![m![31me![m![31mt![m![31ma ![m![31mg![m![31ma![m![31mm![m![31mm![m![31ma ![m![32m4 ![m![32m2 ![md![me![ml![mt![ma![m
+EOF
+
+test_expect_success 'character by character regex' '
+ git diff --no-index --color-words="\\S" test_a test_b |
+ munge > actual-chars &&
+ test_cmp expect-chars actual-chars
+'
+
+cat <<EOF > expect-nontrivial
+![36m@@ -1,3 +1,3 @@![m
+foo![m_![m![31mbar![m![32mbaz![m_![mbaz![m
+a ![mqu![m_![m![32mnew![m![32m_![mux ![mb ![mc![m
+alpha ![m![31mbeta ![m![31mgamma ![m![32m4![m![32m ![m![32m2![m![32m ![mdelta![m
+EOF
+
+test_expect_success 'nontrivial regex' '
+ git diff --no-index --color-words="[a-z]+|_" test_a test_b |
+ munge > actual-nontrivial &&
+ test_cmp expect-nontrivial actual-nontrivial
+'
+
+test_expect_success 'set a diff driver' '
+ git config diff.testdriver.wordregex "\\S" &&
+ cat <<EOF > .gitattributes
+test_* diff=testdriver
+EOF
+'
+
+test_expect_success 'use default supplied by driver' '
+ git diff --no-index --color-words test_a test_b |
+ munge > actual-chars-2 &&
+ test_cmp expect-chars actual-chars-2
+'
+
+test_expect_success 'option overrides default' '
+ git diff --no-index --color-words="[a-z]+|_" test_a test_b |
+ munge > actual-nontrivial-2 &&
+ test_cmp expect-nontrivial actual-nontrivial-2
+'
+
+test_done
--
1.6.1.269.g0769
^ permalink raw reply related
* Re: [PATCH] filter-branch: add git_commit_non_empty_tree and --prune-empty.
From: Pierre Habouzit @ 2009-01-11 11:18 UTC (permalink / raw)
To: Jay Soffian; +Cc: Junio C Hamano, git, pasky, srabbelier
In-Reply-To: <76718490901091129q534ca981iac54e0653d76170d@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 979 bytes --]
On Fri, Jan 09, 2009 at 07:29:15PM +0000, Jay Soffian wrote:
> On Mon, Nov 3, 2008 at 10:18 AM, Pierre Habouzit <madcoder@debian.org> wrote:
> > On Mon, Nov 03, 2008 at 09:27:29AM +0000, Pierre Habouzit wrote:
> >> On Mon, Nov 03, 2008 at 04:58:44AM +0000, Junio C Hamano wrote:
>
> Bump, http://thread.gmane.org/gmane.comp.version-control.git/99440/
>
> (I'd like to see this included. Having a bunch of empty commits after
> using filter-branch to remove unwanted files from history is, er,
> sub-optimal, so seems like it might even be default behavior?)
Yeah I have that in my own git tree, and I meant to ask if something had
to be fixed for it to be accepted for some time, but always forget about
it.
Junio, do you think this could be accepted, or does it need some work ?
--
·O· Pierre Habouzit
··O madcoder@debian.org
OOO http://www.madism.org
[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply
* Re: Funny: git -p submodule summary
From: Jeff King @ 2009-01-11 11:22 UTC (permalink / raw)
To: Johannes Sixt; +Cc: Junio C Hamano, Johannes Schindelin, git
In-Reply-To: <496728B9.7090200@viscovery.net>
On Fri, Jan 09, 2009 at 11:36:41AM +0100, Johannes Sixt wrote:
> I'll test your other patch (that replaces the execvp in git.c by
> run_command).
There is something funny with it that I have not diagnosed: aliases are
broken, and "git foobar" does not return an error. Presumably just
checking the "we did not exec succesfully" case is not triggering
properly. However, I think the right solution is actually to refactor
git.c to figure out ahead of time whether we have a builtin, external,
or alias. I can work on that, but not tonight, as my git-time is up for
now.
But other than that, did it work for you on Windows?
However, here is a 4-patch series that handles the separate signal
delivery problem. It should fix the "^C makes funny things happen"
problems you were seeing. Please test and let me know how it works on
Windows.
The patches are:
1/4: Makefile: clean up TEST_PROGRAMS definition
2/4: chain kill signals for cleanup functions
3/4: refactor signal handling for cleanup functions
4/4: pager: do wait_for_pager on signal death
-Peff
^ permalink raw reply
* [PATCH 1/4] Makefile: clean up TEST_PROGRAMS definition
From: Jeff King @ 2009-01-11 11:25 UTC (permalink / raw)
To: Johannes Sixt; +Cc: Junio C Hamano, Johannes Schindelin, git
In-Reply-To: <20090111112222.GA29656@coredump.intra.peff.net>
We try to keep lines under 80 characters, not to mention
that sticking a bunch of stuff on one line makes diffs
messier.
Signed-off-by: Jeff King <peff@peff.net>
---
Just a cleanup I noticed while working on 2/4.
Makefile | 9 ++++++++-
1 files changed, 8 insertions(+), 1 deletions(-)
diff --git a/Makefile b/Makefile
index dee97c1..2b873fa 100644
--- a/Makefile
+++ b/Makefile
@@ -1356,7 +1356,14 @@ endif
### Testing rules
-TEST_PROGRAMS = test-chmtime$X test-genrandom$X test-date$X test-delta$X test-sha1$X test-match-trees$X test-parse-options$X test-path-utils$X
+TEST_PROGRAMS += test-chmtime$X
+TEST_PROGRAMS += test-date$X
+TEST_PROGRAMS += test-delta$X
+TEST_PROGRAMS += test-genrandom$X
+TEST_PROGRAMS += test-match-trees$X
+TEST_PROGRAMS += test-parse-options$X
+TEST_PROGRAMS += test-path-utils$X
+TEST_PROGRAMS += test-sha1$X
all:: $(TEST_PROGRAMS)
--
1.6.1.84.g8150
^ permalink raw reply related
* [PATCH 2/4] chain kill signals for cleanup functions
From: Jeff King @ 2009-01-11 11:32 UTC (permalink / raw)
To: Johannes Sixt; +Cc: Junio C Hamano, Johannes Schindelin, git
In-Reply-To: <20090111112222.GA29656@coredump.intra.peff.net>
If a piece of code wanted to do some cleanup before exiting
(e.g., cleaing up a lockfile or a tempfile), our usual
strategy was to install a signal handler that did something
like this:
do_cleanup(); /* actual work */
signal(signo, SIG_DFL); /* restore previous behavior */
raise(signo); /* deliver signal, killing ourselves */
For a single handler, this works fine. However, if we want
to clean up two _different_ things, we run into a problem.
The most recently installed handler will run, but when it
removes itself as a handler, it doesn't put back the first
handler.
This patch introduces sigchain, a tiny library for handling
a stack of signal handlers. You sigchain_push each handler,
and use sigchain_pop to restore whoever was before you in
the stack.
Signed-off-by: Jeff King <peff@peff.net>
---
I don't know if it is possible to trigger this issue with current git or
not. There are several instances that catch signals in different ways,
but I don't know if there are code paths that actually combine them. So
maybe it is fixing a bug, but it is definitely needed for 4/4, which
will install the wait_for_pager handler in most git runs.
I tried to neither over- nor under-engineer. On one hand, you could
actually accomplish the same thing by always keeping a "this was the
last handler" pointer at each callsite when installing the new handler.
But it's easy to mess up, so I think a little library helps keep the
code clean.
On the other hand, callsites still have to have separate clean_foo() and
clean_foo_on_signal() handlers, and manually pop() and raise() in the
latter. If the code just assumed you always wanted to chain to the next
handler immediately, then we could cut out even more client code.
The test script is probably overkill, but I wrote it to convince myself
I hadn't screwed up too badly (and because we probably wouldn't even
notice, as signals aren't well tested in our scripts). So I figured I
might as well share it.
.gitignore | 1 +
Makefile | 3 +++
builtin-clone.c | 5 +++--
builtin-fetch--tool.c | 5 +++--
builtin-fetch.c | 5 +++--
diff.c | 5 +++--
http-push.c | 11 ++++++-----
lockfile.c | 13 +++++++------
sigchain.c | 43 +++++++++++++++++++++++++++++++++++++++++++
sigchain.h | 9 +++++++++
t/t0005-signals.sh | 18 ++++++++++++++++++
test-sigchain.c | 22 ++++++++++++++++++++++
12 files changed, 121 insertions(+), 19 deletions(-)
create mode 100644 sigchain.c
create mode 100644 sigchain.h
create mode 100755 t/t0005-signals.sh
create mode 100644 test-sigchain.c
diff --git a/.gitignore b/.gitignore
index d9adce5..f28a54d 100644
--- a/.gitignore
+++ b/.gitignore
@@ -152,6 +152,7 @@ test-match-trees
test-parse-options
test-path-utils
test-sha1
+test-sigchain
common-cmds.h
*.tar.gz
*.dsc
diff --git a/Makefile b/Makefile
index 2b873fa..fd02dec 100644
--- a/Makefile
+++ b/Makefile
@@ -388,6 +388,7 @@ LIB_H += revision.h
LIB_H += run-command.h
LIB_H += sha1-lookup.h
LIB_H += sideband.h
+LIB_H += sigchain.h
LIB_H += strbuf.h
LIB_H += tag.h
LIB_H += transport.h
@@ -481,6 +482,7 @@ LIB_OBJS += sha1-lookup.o
LIB_OBJS += sha1_name.o
LIB_OBJS += shallow.o
LIB_OBJS += sideband.o
+LIB_OBJS += sigchain.o
LIB_OBJS += strbuf.o
LIB_OBJS += symlinks.o
LIB_OBJS += tag.o
@@ -1364,6 +1366,7 @@ TEST_PROGRAMS += test-match-trees$X
TEST_PROGRAMS += test-parse-options$X
TEST_PROGRAMS += test-path-utils$X
TEST_PROGRAMS += test-sha1$X
+TEST_PROGRAMS += test-sigchain$X
all:: $(TEST_PROGRAMS)
diff --git a/builtin-clone.c b/builtin-clone.c
index f1a1a0c..18b9392 100644
--- a/builtin-clone.c
+++ b/builtin-clone.c
@@ -19,6 +19,7 @@
#include "strbuf.h"
#include "dir.h"
#include "pack-refs.h"
+#include "sigchain.h"
/*
* Overall FIXMEs:
@@ -288,7 +289,7 @@ static void remove_junk(void)
static void remove_junk_on_signal(int signo)
{
remove_junk();
- signal(SIGINT, SIG_DFL);
+ sigchain_pop(signo);
raise(signo);
}
@@ -438,7 +439,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
}
junk_git_dir = git_dir;
atexit(remove_junk);
- signal(SIGINT, remove_junk_on_signal);
+ sigchain_push(SIGINT, remove_junk_on_signal);
setenv(CONFIG_ENVIRONMENT, xstrdup(mkpath("%s/config", git_dir)), 1);
diff --git a/builtin-fetch--tool.c b/builtin-fetch--tool.c
index 469b07e..b1d7f8f 100644
--- a/builtin-fetch--tool.c
+++ b/builtin-fetch--tool.c
@@ -2,6 +2,7 @@
#include "cache.h"
#include "refs.h"
#include "commit.h"
+#include "sigchain.h"
static char *get_stdin(void)
{
@@ -186,7 +187,7 @@ static void remove_keep(void)
static void remove_keep_on_signal(int signo)
{
remove_keep();
- signal(SIGINT, SIG_DFL);
+ sigchain_pop(signo);
raise(signo);
}
@@ -245,7 +246,7 @@ static int fetch_native_store(FILE *fp,
char buffer[1024];
int err = 0;
- signal(SIGINT, remove_keep_on_signal);
+ sigchain_push(SIGINT, remove_keep_on_signal);
atexit(remove_keep);
while (fgets(buffer, sizeof(buffer), stdin)) {
diff --git a/builtin-fetch.c b/builtin-fetch.c
index de6f307..8c86974 100644
--- a/builtin-fetch.c
+++ b/builtin-fetch.c
@@ -10,6 +10,7 @@
#include "transport.h"
#include "run-command.h"
#include "parse-options.h"
+#include "sigchain.h"
static const char * const builtin_fetch_usage[] = {
"git fetch [options] [<repository> <refspec>...]",
@@ -58,7 +59,7 @@ static void unlock_pack(void)
static void unlock_pack_on_signal(int signo)
{
unlock_pack();
- signal(SIGINT, SIG_DFL);
+ sigchain_pop(signo);
raise(signo);
}
@@ -672,7 +673,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
ref_nr = j;
}
- signal(SIGINT, unlock_pack_on_signal);
+ sigchain_push(SIGINT, unlock_pack_on_signal);
atexit(unlock_pack);
exit_code = do_fetch(transport,
parse_fetch_refspec(ref_nr, refs), ref_nr);
diff --git a/diff.c b/diff.c
index d235482..5a74012 100644
--- a/diff.c
+++ b/diff.c
@@ -12,6 +12,7 @@
#include "run-command.h"
#include "utf8.h"
#include "userdiff.h"
+#include "sigchain.h"
#ifdef NO_FAST_WORKING_DIRECTORY
#define FAST_WORKING_DIRECTORY 0
@@ -1933,7 +1934,7 @@ static void remove_tempfile(void)
static void remove_tempfile_on_signal(int signo)
{
remove_tempfile();
- signal(SIGINT, SIG_DFL);
+ sigchain_pop(signo);
raise(signo);
}
@@ -1968,7 +1969,7 @@ static void run_external_diff(const char *pgm,
atexit_asked = 1;
atexit(remove_tempfile);
}
- signal(SIGINT, remove_tempfile_on_signal);
+ sigchain_push(SIGINT, remove_tempfile_on_signal);
}
if (one && two) {
diff --git a/http-push.c b/http-push.c
index a4b7d08..dec395d 100644
--- a/http-push.c
+++ b/http-push.c
@@ -10,6 +10,7 @@
#include "exec_cmd.h"
#include "remote.h"
#include "list-objects.h"
+#include "sigchain.h"
#include <expat.h>
@@ -1363,7 +1364,7 @@ static void remove_locks(void)
static void remove_locks_on_signal(int signo)
{
remove_locks();
- signal(signo, SIG_DFL);
+ sigchain_pop(signo);
raise(signo);
}
@@ -2261,10 +2262,10 @@ int main(int argc, char **argv)
goto cleanup;
}
- signal(SIGINT, remove_locks_on_signal);
- signal(SIGHUP, remove_locks_on_signal);
- signal(SIGQUIT, remove_locks_on_signal);
- signal(SIGTERM, remove_locks_on_signal);
+ sigchain_push(SIGINT, remove_locks_on_signal);
+ sigchain_push(SIGHUP, remove_locks_on_signal);
+ sigchain_push(SIGQUIT, remove_locks_on_signal);
+ sigchain_push(SIGTERM, remove_locks_on_signal);
/* Check whether the remote has server info files */
remote->can_update_info_refs = 0;
diff --git a/lockfile.c b/lockfile.c
index 8589155..3cd57dc 100644
--- a/lockfile.c
+++ b/lockfile.c
@@ -2,6 +2,7 @@
* Copyright (c) 2005, Junio C Hamano
*/
#include "cache.h"
+#include "sigchain.h"
static struct lock_file *lock_file_list;
static const char *alternate_index_output;
@@ -24,7 +25,7 @@ static void remove_lock_file(void)
static void remove_lock_file_on_signal(int signo)
{
remove_lock_file();
- signal(signo, SIG_DFL);
+ sigchain_pop(signo);
raise(signo);
}
@@ -136,11 +137,11 @@ static int lock_file(struct lock_file *lk, const char *path, int flags)
lk->fd = open(lk->filename, O_RDWR | O_CREAT | O_EXCL, 0666);
if (0 <= lk->fd) {
if (!lock_file_list) {
- signal(SIGINT, remove_lock_file_on_signal);
- signal(SIGHUP, remove_lock_file_on_signal);
- signal(SIGTERM, remove_lock_file_on_signal);
- signal(SIGQUIT, remove_lock_file_on_signal);
- signal(SIGPIPE, remove_lock_file_on_signal);
+ sigchain_push(SIGINT, remove_lock_file_on_signal);
+ sigchain_push(SIGHUP, remove_lock_file_on_signal);
+ sigchain_push(SIGTERM, remove_lock_file_on_signal);
+ sigchain_push(SIGQUIT, remove_lock_file_on_signal);
+ sigchain_push(SIGPIPE, remove_lock_file_on_signal);
atexit(remove_lock_file);
}
lk->owner = getpid();
diff --git a/sigchain.c b/sigchain.c
new file mode 100644
index 0000000..a18d505
--- /dev/null
+++ b/sigchain.c
@@ -0,0 +1,43 @@
+#include "sigchain.h"
+#include "cache.h"
+
+#define SIGCHAIN_MAX_SIGNALS 32
+
+struct sigchain_signal {
+ sigchain_fun *old;
+ int n;
+ int alloc;
+};
+static struct sigchain_signal signals[SIGCHAIN_MAX_SIGNALS];
+
+static void check_signum(int sig)
+{
+ if (sig < 1 || sig >= SIGCHAIN_MAX_SIGNALS)
+ die("BUG: signal out of range: %d", sig);
+}
+
+int sigchain_push(int sig, sigchain_fun f)
+{
+ struct sigchain_signal *s = signals + sig;
+ check_signum(sig);
+
+ ALLOC_GROW(s->old, s->n + 1, s->alloc);
+ s->old[s->n] = signal(sig, f);
+ if (s->old[s->n] == SIG_ERR)
+ return -1;
+ s->n++;
+ return 0;
+}
+
+int sigchain_pop(int sig)
+{
+ struct sigchain_signal *s = signals + sig;
+ check_signum(sig);
+ if (s->n < 1)
+ return 0;
+
+ if (signal(sig, s->old[s->n - 1]) == SIG_ERR)
+ return -1;
+ s->n--;
+ return 0;
+}
diff --git a/sigchain.h b/sigchain.h
new file mode 100644
index 0000000..254ebb0
--- /dev/null
+++ b/sigchain.h
@@ -0,0 +1,9 @@
+#ifndef SIGCHAIN_H
+#define SIGCHAIN_H
+
+typedef void (*sigchain_fun)(int);
+
+int sigchain_push(int sig, sigchain_fun f);
+int sigchain_pop(int sig);
+
+#endif /* SIGCHAIN_H */
diff --git a/t/t0005-signals.sh b/t/t0005-signals.sh
new file mode 100755
index 0000000..d900df4
--- /dev/null
+++ b/t/t0005-signals.sh
@@ -0,0 +1,18 @@
+#!/bin/sh
+
+test_description='signals work as we expect'
+. ./test-lib.sh
+
+cat >expect <<EOF
+three
+two
+one
+EOF
+
+test_expect_success 'sigchain works' '
+ test-sigchain >actual
+ test "$?" = 130 &&
+ test_cmp expect actual
+'
+
+test_done
diff --git a/test-sigchain.c b/test-sigchain.c
new file mode 100644
index 0000000..8747dea
--- /dev/null
+++ b/test-sigchain.c
@@ -0,0 +1,22 @@
+#include "sigchain.h"
+#include "cache.h"
+
+#define X(f) \
+static void f(int sig) { \
+ puts(#f); \
+ fflush(stdout); \
+ sigchain_pop(sig); \
+ raise(sig); \
+}
+X(one)
+X(two)
+X(three)
+#undef X
+
+int main(int argc, char **argv) {
+ sigchain_push(SIGINT, one);
+ sigchain_push(SIGINT, two);
+ sigchain_push(SIGINT, three);
+ raise(SIGINT);
+ return 0;
+}
--
1.6.1.84.g8150
^ permalink raw reply related
* [PATCH 3/4] refactor signal handling for cleanup functions
From: Jeff King @ 2009-01-11 11:36 UTC (permalink / raw)
To: Johannes Sixt; +Cc: Junio C Hamano, Johannes Schindelin, git
In-Reply-To: <20090111112222.GA29656@coredump.intra.peff.net>
The current code is very inconsistent about which signals
are caught for doing cleanup of temporary files and lock
files. Some callsites checked only SIGINT, while others
checked a variety of death-dealing signals.
This patch factors out those signals to a single function,
and then calls it everywhere. For some sites, that means
this is a simple clean up. For others, it is an improvement
in that they will now properly clean themselves up after a
larger variety of signals.
Signed-off-by: Jeff King <peff@peff.net>
---
I'm assuming there was no good reason _not_ to be handling those other
signals at some of the "just handle SIGINT" sites. A sigchain
implementation which handled "remove_lock_file" without needing
"remove_lock_file_on_signal" could also call atexit(), too. So you could
have a one liner
register_cleanup(remove_lock_file);
However, there is one case that doesn't use an atexit() handler:
http-push.c. I don't know if this is a bug or an intentional omission.
builtin-clone.c | 2 +-
builtin-fetch--tool.c | 2 +-
builtin-fetch.c | 2 +-
diff.c | 2 +-
http-push.c | 5 +----
lockfile.c | 6 +-----
sigchain.c | 9 +++++++++
sigchain.h | 2 ++
8 files changed, 17 insertions(+), 13 deletions(-)
diff --git a/builtin-clone.c b/builtin-clone.c
index 18b9392..44c8073 100644
--- a/builtin-clone.c
+++ b/builtin-clone.c
@@ -439,7 +439,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
}
junk_git_dir = git_dir;
atexit(remove_junk);
- sigchain_push(SIGINT, remove_junk_on_signal);
+ sigchain_push_common(remove_junk_on_signal);
setenv(CONFIG_ENVIRONMENT, xstrdup(mkpath("%s/config", git_dir)), 1);
diff --git a/builtin-fetch--tool.c b/builtin-fetch--tool.c
index b1d7f8f..29356d2 100644
--- a/builtin-fetch--tool.c
+++ b/builtin-fetch--tool.c
@@ -246,7 +246,7 @@ static int fetch_native_store(FILE *fp,
char buffer[1024];
int err = 0;
- sigchain_push(SIGINT, remove_keep_on_signal);
+ sigchain_push_common(remove_keep_on_signal);
atexit(remove_keep);
while (fgets(buffer, sizeof(buffer), stdin)) {
diff --git a/builtin-fetch.c b/builtin-fetch.c
index 8c86974..1e4a3d9 100644
--- a/builtin-fetch.c
+++ b/builtin-fetch.c
@@ -673,7 +673,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
ref_nr = j;
}
- sigchain_push(SIGINT, unlock_pack_on_signal);
+ sigchain_push_common(unlock_pack_on_signal);
atexit(unlock_pack);
exit_code = do_fetch(transport,
parse_fetch_refspec(ref_nr, refs), ref_nr);
diff --git a/diff.c b/diff.c
index 5a74012..7fc8512 100644
--- a/diff.c
+++ b/diff.c
@@ -1969,7 +1969,7 @@ static void run_external_diff(const char *pgm,
atexit_asked = 1;
atexit(remove_tempfile);
}
- sigchain_push(SIGINT, remove_tempfile_on_signal);
+ sigchain_push_common(remove_tempfile_on_signal);
}
if (one && two) {
diff --git a/http-push.c b/http-push.c
index dec395d..7d5c23e 100644
--- a/http-push.c
+++ b/http-push.c
@@ -2262,10 +2262,7 @@ int main(int argc, char **argv)
goto cleanup;
}
- sigchain_push(SIGINT, remove_locks_on_signal);
- sigchain_push(SIGHUP, remove_locks_on_signal);
- sigchain_push(SIGQUIT, remove_locks_on_signal);
- sigchain_push(SIGTERM, remove_locks_on_signal);
+ sigchain_push_common(remove_locks_on_signal);
/* Check whether the remote has server info files */
remote->can_update_info_refs = 0;
diff --git a/lockfile.c b/lockfile.c
index 3cd57dc..021c337 100644
--- a/lockfile.c
+++ b/lockfile.c
@@ -137,11 +137,7 @@ static int lock_file(struct lock_file *lk, const char *path, int flags)
lk->fd = open(lk->filename, O_RDWR | O_CREAT | O_EXCL, 0666);
if (0 <= lk->fd) {
if (!lock_file_list) {
- sigchain_push(SIGINT, remove_lock_file_on_signal);
- sigchain_push(SIGHUP, remove_lock_file_on_signal);
- sigchain_push(SIGTERM, remove_lock_file_on_signal);
- sigchain_push(SIGQUIT, remove_lock_file_on_signal);
- sigchain_push(SIGPIPE, remove_lock_file_on_signal);
+ sigchain_push_common(remove_lock_file_on_signal);
atexit(remove_lock_file);
}
lk->owner = getpid();
diff --git a/sigchain.c b/sigchain.c
index a18d505..1118b99 100644
--- a/sigchain.c
+++ b/sigchain.c
@@ -41,3 +41,12 @@ int sigchain_pop(int sig)
s->n--;
return 0;
}
+
+void sigchain_push_common(sigchain_fun f)
+{
+ sigchain_push(SIGINT, f);
+ sigchain_push(SIGHUP, f);
+ sigchain_push(SIGTERM, f);
+ sigchain_push(SIGQUIT, f);
+ sigchain_push(SIGPIPE, f);
+}
diff --git a/sigchain.h b/sigchain.h
index 254ebb0..618083b 100644
--- a/sigchain.h
+++ b/sigchain.h
@@ -6,4 +6,6 @@ typedef void (*sigchain_fun)(int);
int sigchain_push(int sig, sigchain_fun f);
int sigchain_pop(int sig);
+void sigchain_push_common(sigchain_fun f);
+
#endif /* SIGCHAIN_H */
--
1.6.1.84.g8150
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox