Git development
 help / color / mirror / Atom feed
* Re: Version 1.8.1 does not compile on Cygwin 1.7.14
From: Jonathan Nieder @ 2013-01-06 12:09 UTC (permalink / raw)
  To: Mark Levedahl
  Cc: Torsten Bögershausen, Stephen & Linda Smith,
	Jason Pyeron, git, Eric Blake
In-Reply-To: <50E9647F.4090209@gmail.com>

Mark Levedahl wrote:

>                                                          However, the newer
> win32api is provided only for the current cygwin release series, which can
> be reliably identified by having dll version 1.7.x, while the older frozen
> releases (dll versions 1.6.x from redhat, 1.5.x open source) still have the
> older api as no updates are being made for the legacy version(s).

Ah.  That makes sense, thanks.

(For the future, if we wanted to diagnose an out-of-date win32api and
print a helpful message, I guess cygcheck would be the command to use.)

^ permalink raw reply

* Re: [PATCH] docs: manpage XML depends on asciidoc.conf
From: John Keeping @ 2013-01-06 12:33 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Junio C Hamano, git, Sergey Vlasov, Thomas Ackermann
In-Reply-To: <20130106120153.GB22081@elie.Belkin>

On Sun, Jan 06, 2013 at 04:01:53AM -0800, Jonathan Nieder wrote:
> When building manual pages, the source text is transformed to XML with
> AsciiDoc before the man pages are generated from the XML with xmlto.
> 
> Fix the dependencies in the Makefile so that the XML files are rebuilt
> when asciidoc.conf changes and not just the manual pages from
> unchanged XML, and move the dependencies from a recipeless rule to the
> rules with commands that use asciidoc.conf to make the dependencies
> easier to understand and maintain.
> 
> Reported-by: John Keeping <john@keeping.me.uk>
> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
> ---

This fixes the problem I wanted to fix (as well as being clearer for the
future), so FWIW:

Tested-by: John Keeping <john@keeping.me.uk>

^ permalink raw reply

* [PATCH] git-fast-import(1): reorganise options
From: John Keeping @ 2013-01-06 13:29 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jonathan Nieder, Eric S. Raymond, git, David Michael Barr,
	Pete Wyckoff
In-Reply-To: <7vy5g6okdi.fsf@alter.siamese.dyndns.org>

On Sat, Jan 05, 2013 at 11:12:25PM -0800, Junio C Hamano wrote:
> Jonathan Nieder <jrnieder@gmail.com> writes:
>> But in fact the current options list doesn't seem to be well organized at all.
>> What do you think would be a logical way to group these?
>>
>>  Features of input syntax
>>
>> 	--date-format
>> 	--done
>>
>>  Verbosity
>>
>> 	--quiet
>> 	--stats
>>
>>  Marks handling (checkpoint/restore)
>>
>> 	--import-marks
>> 	--import-marks-if-exists
>> 	--export-marks
>> 	--relative-marks
>>
>>  Semantics of execution
>>
>> 	--dry-run
>> 	--force
>> 	--cat-blob-fd
>> 	--export-pack-edges
>>
>>  Tuning
>>
>> 	--active-branches
>> 	--max-pack-size
>> 	--big-file-threshold
>> 	--depth
> 
> Sounds sensible.

How about this?

I left the "Semantics of execution" options with the general options
since I couldn't think of a sensible heading that didn't also apply to
'--quiet' or '--stats', but I think the result is reasonable.

-- <8 --

The options in git-fast-import(1) are not currently arranged in a
logical order, which has caused the '--done' options to be documented
twice (commit 3266de10).

Rearrange them into logical groups under subheadings.

While doing this, fix the duplicate '--done' documentation by taking the
best bits of each.  Also combine the descriptions of '--relative-marks'
and '--no-relative-marks' since they make more sense together.

Suggested-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: John Keeping <john@keeping.me.uk>
---
 Documentation/git-fast-import.txt | 115 +++++++++++++++++++-------------------
 1 file changed, 59 insertions(+), 56 deletions(-)

diff --git a/Documentation/git-fast-import.txt b/Documentation/git-fast-import.txt
index 68bca1a..0e25c8d 100644
--- a/Documentation/git-fast-import.txt
+++ b/Documentation/git-fast-import.txt
@@ -33,38 +33,55 @@ the frontend program in use.
 
 OPTIONS
 -------
---date-format=<fmt>::
-	Specify the type of dates the frontend will supply to
-	fast-import within `author`, `committer` and `tagger` commands.
-	See ``Date Formats'' below for details about which formats
-	are supported, and their syntax.
 
--- done::
-	Terminate with error if there is no 'done' command at the
-	end of the stream.
+--quiet::
+	Disable all non-fatal output, making fast-import silent when it
+	is successful.  This option disables the output shown by
+	\--stats.
+
+--stats::
+	Display some basic statistics about the objects fast-import has
+	created, the packfiles they were stored into, and the
+	memory used by fast-import during this run.  Showing this output
+	is currently the default, but can be disabled with \--quiet.
 
 --force::
 	Force updating modified existing branches, even if doing
 	so would cause commits to be lost (as the new commit does
 	not contain the old commit).
 
---max-pack-size=<n>::
-	Maximum size of each output packfile.
-	The default is unlimited.
+--cat-blob-fd=<fd>::
+	Write responses to `cat-blob` and `ls` queries to the
+	file descriptor <fd> instead of `stdout`.  Allows `progress`
+	output intended for the end-user to be separated from other
+	output.
 
---big-file-threshold=<n>::
-	Maximum size of a blob that fast-import will attempt to
-	create a delta for, expressed in bytes.  The default is 512m
-	(512 MiB).  Some importers may wish to lower this on systems
-	with constrained memory.
+--export-pack-edges=<file>::
+	After creating a packfile, print a line of data to
+	<file> listing the filename of the packfile and the last
+	commit on each branch that was written to that packfile.
+	This information may be useful after importing projects
+	whose total object set exceeds the 4 GiB packfile limit,
+	as these commits can be used as edge points during calls
+	to 'git pack-objects'.
 
---depth=<n>::
-	Maximum delta depth, for blob and tree deltification.
-	Default is 10.
+Options related to the input stream
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
---active-branches=<n>::
-	Maximum number of branches to maintain active at once.
-	See ``Memory Utilization'' below for details.  Default is 5.
+--date-format=<fmt>::
+	Specify the type of dates the frontend will supply to
+	fast-import within `author`, `committer` and `tagger` commands.
+	See ``Date Formats'' below for details about which formats
+	are supported, and their syntax.
+
+--done::
+	Terminate with error if there is no `done` command at the end of
+	the stream.  This option might be useful for detecting errors
+	that cause the frontend to terminate before it has started to
+	write a stream.
+
+Options related to marks
+~~~~~~~~~~~~~~~~~~~~~~~~
 
 --export-marks=<file>::
 	Dumps the internal marks table to <file> when complete.
@@ -87,51 +104,37 @@ OPTIONS
 	Like --import-marks but instead of erroring out, silently
 	skips the file if it does not exist.
 
---relative-marks::
+--[no-]relative-marks::
 	After specifying --relative-marks the paths specified
 	with --import-marks= and --export-marks= are relative
 	to an internal directory in the current repository.
 	In git-fast-import this means that the paths are relative
 	to the .git/info/fast-import directory. However, other
 	importers may use a different location.
++
+Relative and non-relative marks may be combined by interweaving
+--(no-)-relative-marks with the --(import|export)-marks= options.
 
---no-relative-marks::
-	Negates a previous --relative-marks. Allows for combining
-	relative and non-relative marks by interweaving
-	--(no-)-relative-marks with the --(import|export)-marks=
-	options.
+Options for tuning
+~~~~~~~~~~~~~~~~~~
 
---cat-blob-fd=<fd>::
-	Write responses to `cat-blob` and `ls` queries to the
-	file descriptor <fd> instead of `stdout`.  Allows `progress`
-	output intended for the end-user to be separated from other
-	output.
-
---done::
-	Require a `done` command at the end of the stream.
-	This option might be useful for detecting errors that
-	cause the frontend to terminate before it has started to
-	write a stream.
+--active-branches=<n>::
+	Maximum number of branches to maintain active at once.
+	See ``Memory Utilization'' below for details.  Default is 5.
 
---export-pack-edges=<file>::
-	After creating a packfile, print a line of data to
-	<file> listing the filename of the packfile and the last
-	commit on each branch that was written to that packfile.
-	This information may be useful after importing projects
-	whose total object set exceeds the 4 GiB packfile limit,
-	as these commits can be used as edge points during calls
-	to 'git pack-objects'.
+--big-file-threshold=<n>::
+	Maximum size of a blob that fast-import will attempt to
+	create a delta for, expressed in bytes.  The default is 512m
+	(512 MiB).  Some importers may wish to lower this on systems
+	with constrained memory.
 
---quiet::
-	Disable all non-fatal output, making fast-import silent when it
-	is successful.  This option disables the output shown by
-	\--stats.
+--depth=<n>::
+	Maximum delta depth, for blob and tree deltification.
+	Default is 10.
 
---stats::
-	Display some basic statistics about the objects fast-import has
-	created, the packfiles they were stored into, and the
-	memory used by fast-import during this run.  Showing this output
-	is currently the default, but can be disabled with \--quiet.
+--max-pack-size=<n>::
+	Maximum size of each output packfile.
+	The default is unlimited.
 
 
 Performance
-- 
1.8.0.2

^ permalink raw reply related

* [PATCH/RFC] fast-import doc: split OPTIONS into subsections
From: Jonathan Nieder @ 2013-01-06 13:34 UTC (permalink / raw)
  To: git
  Cc: John Keeping, Eric S. Raymond, David Michael Barr, Pete Wyckoff,
	Thomas Rast
In-Reply-To: <20130105231151.GD3247@elie.Belkin>

The OPTIONS section of this manpage has grown long without any
particular organization to ensure it remains manageable.  Split into
categories to make the documentation for each option easier to find.
The categories:

 1. Features of the input format, such as the date format and whether
    the file is required to end with a "done" command.
 2. How much output the importer should write to stderr.
 3. Marks Handling (Checkpoint/Restart).
 4. Other options that change the behavior in a semantically
    meaningful way (backflow pipe setup, whether to force ref
    updates, where to list pack edges).
 5. Performance and compression tuning.

Reported-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
The second-to-last subsection ("Import Semantics") is kind of a
catch-all.  Better ideas for organization or naming would be
welcome.

 Documentation/git-fast-import.txt | 82 ++++++++++++++++++++++-----------------
 1 file changed, 46 insertions(+), 36 deletions(-)

diff --git a/Documentation/git-fast-import.txt b/Documentation/git-fast-import.txt
index d2c0e357..1676d436 100644
--- a/Documentation/git-fast-import.txt
+++ b/Documentation/git-fast-import.txt
@@ -32,35 +32,36 @@ the frontend program in use.
 
 OPTIONS
 -------
+Input Syntax
+~~~~~~~~~~~~
 --date-format=<fmt>::
 	Specify the type of dates the frontend will supply to
 	fast-import within `author`, `committer` and `tagger` commands.
 	See ``Date Formats'' below for details about which formats
 	are supported, and their syntax.
 
---force::
-	Force updating modified existing branches, even if doing
-	so would cause commits to be lost (as the new commit does
-	not contain the old commit).
+--done::
+	Terminate with error if there is no 'done' command at the
+	end of the stream.
+	This option might be useful for detecting errors that
+	cause the frontend to terminate before it has started to
+	write a stream.
 
---max-pack-size=<n>::
-	Maximum size of each output packfile.
-	The default is unlimited.
+Verbosity
+~~~~~~~~~
+--quiet::
+	Disable all non-fatal output, making fast-import silent when it
+	is successful.  This option disables the output shown by
+	\--stats.
 
---big-file-threshold=<n>::
-	Maximum size of a blob that fast-import will attempt to
-	create a delta for, expressed in bytes.  The default is 512m
-	(512 MiB).  Some importers may wish to lower this on systems
-	with constrained memory.
-
---depth=<n>::
-	Maximum delta depth, for blob and tree deltification.
-	Default is 10.
-
---active-branches=<n>::
-	Maximum number of branches to maintain active at once.
-	See ``Memory Utilization'' below for details.  Default is 5.
+--stats::
+	Display some basic statistics about the objects fast-import has
+	created, the packfiles they were stored into, and the
+	memory used by fast-import during this run.  Showing this output
+	is currently the default, but can be disabled with \--quiet.
 
+Marks Handling (Checkpoint/Restart)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 --export-marks=<file>::
 	Dumps the internal marks table to <file> when complete.
 	Marks are written one per line as `:markid SHA-1`.
@@ -96,18 +97,18 @@ OPTIONS
 	--(no-)-relative-marks with the --(import|export)-marks=
 	options.
 
+Import Semantics
+~~~~~~~~~~~~~~~~
+--force::
+	Force updating modified existing branches, even if doing
+	so would cause commits to be lost (as the new commit does
+	not contain the old commit).
+
 --cat-blob-fd=<fd>::
 	Specify the file descriptor that will be written to
 	when the `cat-blob` command is encountered in the stream.
 	The default behaviour is to write to `stdout`.
 
---done::
-	Terminate with error if there is no 'done' command at the
-	end of the stream.
-	This option might be useful for detecting errors that
-	cause the frontend to terminate before it has started to
-	write a stream.
-
 --export-pack-edges=<file>::
 	After creating a packfile, print a line of data to
 	<file> listing the filename of the packfile and the last
@@ -117,16 +118,25 @@ OPTIONS
 	as these commits can be used as edge points during calls
 	to 'git pack-objects'.
 
---quiet::
-	Disable all non-fatal output, making fast-import silent when it
-	is successful.  This option disables the output shown by
-	\--stats.
+Tuning
+~~~~~~
+--max-pack-size=<n>::
+	Maximum size of each output packfile.
+	The default is unlimited.
 
---stats::
-	Display some basic statistics about the objects fast-import has
-	created, the packfiles they were stored into, and the
-	memory used by fast-import during this run.  Showing this output
-	is currently the default, but can be disabled with \--quiet.
+--big-file-threshold=<n>::
+	Maximum size of a blob that fast-import will attempt to
+	create a delta for, expressed in bytes.  The default is 512m
+	(512 MiB).  Some importers may wish to lower this on systems
+	with constrained memory.
+
+--depth=<n>::
+	Maximum delta depth, for blob and tree deltification.
+	Default is 10.
+
+--active-branches=<n>::
+	Maximum number of branches to maintain active at once.
+	See ``Memory Utilization'' below for details.  Default is 5.
 
 
 Performance
-- 
1.8.1

^ permalink raw reply related

* Re: [PATCH] git-fast-import(1): reorganise options
From: Jonathan Nieder @ 2013-01-06 13:51 UTC (permalink / raw)
  To: John Keeping
  Cc: Junio C Hamano, Eric S. Raymond, git, David Michael Barr,
	Pete Wyckoff, Thomas Rast
In-Reply-To: <20130106132915.GG6440@serenity.lan>

John Keeping wrote:

> How about this?

Ah, our patches crossed.

> I left the "Semantics of execution" options with the general options
> since I couldn't think of a sensible heading

Neat trick. :)

[...]
> -- <8 --
> The options in git-fast-import(1) are not currently arranged in a
> logical order, which has caused the '--done' options to be documented
> twice (commit 3266de10).
>
> Rearrange them into logical groups under subheadings.

Nice description.

> While doing this, fix the duplicate '--done' documentation by taking the
> best bits of each.  Also combine the descriptions of '--relative-marks'
> and '--no-relative-marks' since they make more sense together.

I'd prefer to keep those as separate patches, if that's manageable.

The organization you propose is:

	OPTIONS
	-------
	--quiet
	--stats
	--force
	--cat-blob-fd
	--export-pack-edges

	Options related to the input stream
	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
	--date-format
	--done

	Options related to marks
	~~~~~~~~~~~~~~~~~~~~~~~~
	--export-marks
	--import-marks
	--import-marks-if-exists
	--relative-marks
	--no-relative-marks

	Options for tuning
	~~~~~~~~~~~~~~~~~~
	--active-branches
	--big-file-threshold
	--depth
	--max-pack-size

These headings are less cryptic than the ones I proposed, which is a
nice thing.

My only nitpicks:

I'd worry that the catch-all toplevel category would grow larger
and larger with time, since it's the obvious place to put any new
option.

Part of what I tried to do with the proposed categorization was to
separate options that change the semantics of the import (which one
uses with "feature" when they are specified in the fast-import stream
since ignoring them results in a broken import) from options that only
change superficial aspects of the interface or the details of how the
resulting packfiles representing the same objects get written.

The phrasing of the name of the category "Options related to the input
stream" is too broad.  All options relate to the input stream, since
consuming an input stream and acting on it is all fast-import does.
Something more specific than "related to" and a mention of "syntax"
could make it clearer --- how about something like "Input Syntax
Features"?

Likewise, lots of functionality is _related_ to marks, but the marks
options are the options that specify marks files.  I don't know a good
way to say that --- maybe "Location of Marks Files"?

"Options for Tuning" could also be made more specific --- e.g.,
"Performance and Compression Tuning".

I like how you put important options like --force on top.  Perhaps
the less important --quiet and --stats could be split off from that
into a subsection like "Verbosity" to make them stand out even more.

Generally I think this is a better starting point for future work than
the patch I sent.  Thanks for writing it.

Jonathan

^ permalink raw reply

* [PATCH jk/pathspec-literal] t6130-pathspec-noglob: Windows does not allow a file named "f*"
From: Johannes Sixt @ 2013-01-06 14:07 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Git Mailing List, msysGit

Windows disallows file names that contain a star. Arrange the test setup
to insert the file name "f*" in the repository without the corresponding
file in the worktree.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
---
 t/t6130-pathspec-noglob.sh | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/t/t6130-pathspec-noglob.sh b/t/t6130-pathspec-noglob.sh
index bb5e710..39ef619 100755
--- a/t/t6130-pathspec-noglob.sh
+++ b/t/t6130-pathspec-noglob.sh
@@ -6,7 +6,13 @@ test_description='test globbing (and noglob) of pathspec limiting'
 test_expect_success 'create commits with glob characters' '
 	test_commit unrelated bar &&
 	test_commit vanilla foo &&
-	test_commit star "f*" &&
+	# insert file "f*" in the commit, but in a way that avoids
+	# the name "f*" in the worktree, because it is not allowed
+	# on Windows (the tests below do not depend on the presence
+	# of the file in the worktree)
+	git update-index --add --cacheinfo 100644 "$(git rev-parse HEAD:foo)" "f*" &&
+	test_tick &&
+	git commit -m star &&
 	test_commit bracket "f[o][o]"
 '
 
-- 
1.8.1.1672.g5e2a3d4.dirty

-- 
*** Please reply-to-all at all times ***
*** (do not pretend to know who is subscribed and who is not) ***
*** Please avoid top-posting. ***
The msysGit Wiki is here: https://github.com/msysgit/msysgit/wiki - Github accounts are free.

You received this message because you are subscribed to the Google
Groups "msysGit" group.
To post to this group, send email to msysgit@googlegroups.com
To unsubscribe from this group, send email to
msysgit+unsubscribe@googlegroups.com
For more options, and view previous threads, visit this group at
http://groups.google.com/group/msysgit?hl=en_US?hl=en

^ permalink raw reply related

* Re: Version 1.8.1 does not compile on Cygwin 1.7.14
From: Stephen & Linda Smith @ 2013-01-06 14:09 UTC (permalink / raw)
  To: ischis2
  Cc: Jonathan Nieder, Mark Levedahl, Jason Pyeron, Eric Blake, git,
	Torsten Bögershausen
In-Reply-To: <20130106120917.GC22081@elie.Belkin>

On Sunday, January 06, 2013 04:09:17 AM Jonathan Nieder wrote:
> Mark Levedahl wrote:
> >                                                          However, the
> >                                                          newer
> > 
> > win32api is provided only for the current cygwin release series, which can
> > be reliably identified by having dll version 1.7.x, while the older frozen
> > releases (dll versions 1.6.x from redhat, 1.5.x open source) still have
> > the
> > older api as no updates are being made for the legacy version(s).
> 
> Ah.  That makes sense, thanks.
> 
> (For the future, if we wanted to diagnose an out-of-date win32api and
> print a helpful message, I guess cygcheck would be the command to use.)

Thank you for the information.   I will update my cygwin installation.

^ permalink raw reply

* Re: [PATCH] git-fast-import(1): reorganise options
From: John Keeping @ 2013-01-06 14:28 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Junio C Hamano, Eric S. Raymond, git, David Michael Barr,
	Pete Wyckoff, Thomas Rast
In-Reply-To: <20130106135109.GF22081@elie.Belkin>

On Sun, Jan 06, 2013 at 05:51:09AM -0800, Jonathan Nieder wrote:
> John Keeping wrote:
>> I left the "Semantics of execution" options with the general options
>> since I couldn't think of a sensible heading
> 
> Neat trick. :)

I took inspiration from git-pull(1), which has a few general options
followed by several "Options related to..." sections.

> [...]
> > -- <8 --
> > The options in git-fast-import(1) are not currently arranged in a
> > logical order, which has caused the '--done' options to be documented
> > twice (commit 3266de10).
> >
> > Rearrange them into logical groups under subheadings.
> 
> Nice description.
> 
> > While doing this, fix the duplicate '--done' documentation by taking the
> > best bits of each.  Also combine the descriptions of '--relative-marks'
> > and '--no-relative-marks' since they make more sense together.
> 
> I'd prefer to keep those as separate patches, if that's manageable.

I'll send a series of three patches if the discussion below seems
reasonable:

[1/3] remove duplicate '--done'
[2/3] combine --[no-]relative-marks
[3/3] reorganize options

> The organization you propose is:
> 
> 	OPTIONS
> 	-------
> 	--quiet
> 	--stats
> 	--force
> 	--cat-blob-fd
> 	--export-pack-edges
> 
> 	Options related to the input stream
> 	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 	--date-format
> 	--done
> 
> 	Options related to marks
> 	~~~~~~~~~~~~~~~~~~~~~~~~
> 	--export-marks
> 	--import-marks
> 	--import-marks-if-exists
> 	--relative-marks
> 	--no-relative-marks
> 
> 	Options for tuning
> 	~~~~~~~~~~~~~~~~~~
> 	--active-branches
> 	--big-file-threshold
> 	--depth
> 	--max-pack-size
> 
> These headings are less cryptic than the ones I proposed, which is a
> nice thing.
> 
> My only nitpicks:
> 
> I'd worry that the catch-all toplevel category would grow larger
> and larger with time, since it's the obvious place to put any new
> option.

I agree that that's a concern, perhaps '--cat-blob-fd' should be
combined with '--date-format' and '--done' into a section called
"Options for frontends" or similar?

And maybe '--export-pack-edges' can move to the performance/compression
tuning section?  I expect the interested audience would be the same.

That only leaves three options in that section, which seems more
reasonable.

> Part of what I tried to do with the proposed categorization was to
> separate options that change the semantics of the import (which one
> uses with "feature" when they are specified in the fast-import stream
> since ignoring them results in a broken import) from options that only
> change superficial aspects of the interface or the details of how the
> resulting packfiles representing the same objects get written.
>
> The phrasing of the name of the category "Options related to the input
> stream" is too broad.  All options relate to the input stream, since
> consuming an input stream and acting on it is all fast-import does.
> Something more specific than "related to" and a mention of "syntax"
> could make it clearer --- how about something like "Input Syntax
> Features"?
> 
> Likewise, lots of functionality is _related_ to marks, but the marks
> options are the options that specify marks files.  I don't know a good
> way to say that --- maybe "Location of Marks Files"?
>
> "Options for Tuning" could also be made more specific --- e.g.,
> "Performance and Compression Tuning".

I realise it's personal taste, but I like the subheadings of the form
"Options (for|related to) ...", so maybe:

Options for input stream features
Options related to marks files
Options for performance and compression tuning

Note that I chose sentence case instead of title case to be consistent
with git-pull(1).

> I like how you put important options like --force on top.  Perhaps
> the less important --quiet and --stats could be split off from that
> into a subsection like "Verbosity" to make them stand out even more.

I quite like having the verbosity options near the top since those are
the ones that are most likely to be of interest to a user, whereas the
rest are likely to be prescribed by the frontend (or only really useful
to frontend authors).


John

^ permalink raw reply

* Re: [PATCH jk/pathspec-literal] t6130-pathspec-noglob: Windows does not allow a file named "f*"
From: Jeff King @ 2013-01-06 14:33 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Junio C Hamano, Git Mailing List, msysGit
In-Reply-To: <50E9852F.2060005@kdbg.org>

On Sun, Jan 06, 2013 at 03:07:43PM +0100, Johannes Sixt wrote:

> Windows disallows file names that contain a star. Arrange the test setup
> to insert the file name "f*" in the repository without the corresponding
> file in the worktree.
> [...]
> -	test_commit star "f*" &&
> +	# insert file "f*" in the commit, but in a way that avoids
> +	# the name "f*" in the worktree, because it is not allowed
> +	# on Windows (the tests below do not depend on the presence
> +	# of the file in the worktree)
> +	git update-index --add --cacheinfo 100644 "$(git rev-parse HEAD:foo)" "f*" &&
> +	test_tick &&
> +	git commit -m star &&

Thanks, looks obviously correct to me.

Sorry to break Windows again. It seems I learn about a new gotcha with
every patch series. :)

-Peff

^ permalink raw reply

* Re: [PATCH v3 11/19] dir.c: use a single struct exclude_list per source of excludes
From: Adam Spiers @ 2013-01-06 15:20 UTC (permalink / raw)
  To: git list
In-Reply-To: <7v1ue0veww.fsf@alter.siamese.dyndns.org>

On Fri, Jan 04, 2013 at 01:03:59PM -0800, Junio C Hamano wrote:
> Adam Spiers <git@adamspiers.org> writes:
> 
> > diff --git a/builtin/clean.c b/builtin/clean.c
> > index 0c7b3d0..bd18b88 100644
> > --- a/builtin/clean.c
> > +++ b/builtin/clean.c
> > @@ -97,9 +97,10 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
> >  	if (!ignored)
> >  		setup_standard_excludes(&dir);
> >  
> > +	add_exclude_list(&dir, EXC_CMDL);
> >  	for (i = 0; i < exclude_list.nr; i++)
> >  		add_exclude(exclude_list.items[i].string, "", 0,
> > -			    &dir.exclude_list[EXC_CMDL]);
> > +			    &dir.exclude_list_groups[EXC_CMDL].ary[0]);
> 
> This looks somewhat ugly for two reasons.
> 
>  * The abstraction add_exclude() offers to its callers is just to
>    let them add one pattern to the list of patterns for the kind
>    (here, EXC_CMDL); why should they care about .ary[0] part?

Because the caller has to decide which exclude list the new exclude
should be added to; that is how it has been for a long time, and I am
not proposing we change that.

There are currently three callers:

  builtin/clean.c:    cmd_clean()
  builtin/ls-files.c: option_parse_exclude()
  dir.c:              add_excludes_from_file_to_list()

The first two add to EXC_CMDL, but the latter could be adding to
numerous different possible lists, e.g.

    - .git/info/exclude (EXC_FILE)
    - core.excludesfile (EXC_FILE)
    - any of the per-directory .gitignore lists (EXC_DIRS)

>    Are
>    there cases any sane caller (not the implementation of the
>    exclude_list_group machinery e.g. add_excludes_from_... function)
>    may want to call it with .ary[1]?

Effectively yes, although it is not written like ".ary[1]".  For
example prep_exclude() calls add_excludes_from_file_to_list() for each
new .gitignore file

>    Shouldn't the public API function add_exclude() take a pointer to
>    the list group itself?

Typically EXC_DIRS and EXC_FILE will contain excludes from multiple
sources, whereas EXC_CMDL will contain excludes from a single source.

Therefore when transitioning to exclude_list_groups, I had to make a
choice whether to keep EXC_CMDL as a singleton list, or split it out
into a separate field in struct dir_struct, e.g.:

	struct exclude_list exclude_list_cmdl;
	struct exclude_list exclude_list[2];
#define EXC_DIRS 0
#define EXC_FILE 1

I decided it was cleaner to use a singleton list, because this
preserved the design that excludes in earlier members of
exclude_list[3] take priority over excludes in later members of
exclude_list[3].  That way, this loop in last_exclude_matching():

	for (i = EXC_CMDL; i <= EXC_FILE; i++) {

still makes sense.

>  * When naming an array of things, we tend to prefer naming it
> 
>      type thing[count]
> 
>    so that the second element can be called "thing[2]" and not
>    "things[2]".  dir.exclude_list_group[EXC_CMDL] reads beter.

OK, I will change that.

> > diff --git a/builtin/ls-files.c b/builtin/ls-files.c
> > index ef7f99a..c448e06 100644
> > --- a/builtin/ls-files.c
> > +++ b/builtin/ls-files.c
> > @@ -420,10 +420,10 @@ static int option_parse_z(const struct option *opt,
> >  static int option_parse_exclude(const struct option *opt,
> >  				const char *arg, int unset)
> >  {
> > -	struct exclude_list *list = opt->value;
> > +	struct exclude_list_group *group = opt->value;
> >  
> >  	exc_given = 1;
> > -	add_exclude(arg, "", 0, list);
> > +	add_exclude(arg, "", 0, &group->ary[0]);
> 
> This is another example where the caller would wish to be able to say
> 
> 	add_exclude(arg, "", 0, group);
> 
> instead.

Why?  The caller needs to decide which exclude list the exclude should
go on, because that determines matching priority.  Specifying an
exclude_list_group is insufficient information to determine that.

^ permalink raw reply

* [PATCH] archive-zip: write uncompressed size into header even with streaming
From: René Scharfe @ 2013-01-06 15:20 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano
In-Reply-To: <7vwqw7mb09.fsf@alter.siamese.dyndns.org>

We record the uncompressed and compressed sizes and the CRC of streamed
files as zero in the local header of the file.  The actual values are
recorded in an extra data descriptor after the file content, and in the
usual ZIP directory entry at the end of the archive.

While we know the compressed size and the CRC only after we processed
the contents, we actually know the uncompressed size right from the
start.  And for files that we store uncompressed we also already know
their final size.

Do it like InfoZIP's zip and recored the known values, even though they
can be reconstructed using the ZIP directory and the data descriptors
alone.  InfoZIP's unzip worked fine before, but NetBSD's version
actually depends on these fields.

The uncompressed size is already set by sha1_object_info().  We just
need to initialize the compressed size to zero or the uncompressed size
depending on the compression method (0 means storing).  The CRC was
propertly initialized already.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
---
 archive-zip.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/archive-zip.c b/archive-zip.c
index 55f66b4..d3aef53 100644
--- a/archive-zip.c
+++ b/archive-zip.c
@@ -240,7 +240,7 @@ static int write_zip_entry(struct archiver_args *args,
 			(mode & 0111) ? ((mode) << 16) : 0;
 		if (S_ISREG(mode) && args->compression_level != 0 && size > 0)
 			method = 8;
-		compressed_size = size;
+		compressed_size = (method == 0) ? size : 0;
 
 		if (S_ISREG(mode) && type == OBJ_BLOB && !args->convert &&
 		    size > big_file_threshold) {
@@ -313,10 +313,7 @@ static int write_zip_entry(struct archiver_args *args,
 	copy_le16(header.compression_method, method);
 	copy_le16(header.mtime, zip_time);
 	copy_le16(header.mdate, zip_date);
-	if (flags & ZIP_STREAM)
-		set_zip_header_data_desc(&header, 0, 0, 0);
-	else
-		set_zip_header_data_desc(&header, size, compressed_size, crc);
+	set_zip_header_data_desc(&header, size, compressed_size, crc);
 	copy_le16(header.filename_length, pathlen);
 	copy_le16(header.extra_length, ZIP_EXTRA_MTIME_SIZE);
 	write_or_die(1, &header, ZIP_LOCAL_HEADER_SIZE);
-- 
1.7.12

^ permalink raw reply related

* Re: [PATCH v3 11/19] dir.c: use a single struct exclude_list per source of excludes
From: Adam Spiers @ 2013-01-06 15:27 UTC (permalink / raw)
  To: git list
In-Reply-To: <7v623cqd39.fsf@alter.siamese.dyndns.org>

On Fri, Jan 04, 2013 at 11:54:34PM -0800, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> > Adam Spiers <git@adamspiers.org> writes:
> >
> >> diff --git a/builtin/clean.c b/builtin/clean.c
> >> index 0c7b3d0..bd18b88 100644
> >> --- a/builtin/clean.c
> >> +++ b/builtin/clean.c
> >> @@ -97,9 +97,10 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
> >>  	if (!ignored)
> >>  		setup_standard_excludes(&dir);
> >>  
> >> +	add_exclude_list(&dir, EXC_CMDL);
> >>  	for (i = 0; i < exclude_list.nr; i++)
> >>  		add_exclude(exclude_list.items[i].string, "", 0,
> >> -			    &dir.exclude_list[EXC_CMDL]);
> >> +			    &dir.exclude_list_groups[EXC_CMDL].ary[0]);
> >
> > This looks somewhat ugly for two reasons.
> >
> >  * The abstraction add_exclude() offers to its callers is just to
> >    let them add one pattern to the list of patterns for the kind
> >    (here, EXC_CMDL); why should they care about .ary[0] part?  Are
> >    there cases any sane caller (not the implementation of the
> >    exclude_list_group machinery e.g. add_excludes_from_... function)
> >    may want to call it with .ary[1]?  I do not think of any.
> >    Shouldn't the public API function add_exclude() take a pointer to
> >    the list group itself?
> >
> >  * When naming an array of things, we tend to prefer naming it
> >
> >      type thing[count]
> >
> >    so that the second element can be called "thing[2]" and not
> >    "things[2]".  dir.exclude_list_group[EXC_CMDL] reads better.
> 
> Also, "ary[]" is a bad name, even as an implementation detail, for
> two reasons: it is naming it after its type (being an "array") not
> after what it is (if it holds the patterns from the same information
> source, e.g. file, togeter, "src" might be a better name), and it
> uses rather unusual abbreviation (I haven't seen "array" shortened
> to "ary" anywhere else).

OK, well in that case Documentation/technical/api-allocation-growing.txt
needs to be fixed, because I copied it from that.  I would never normally
shorten "array" to "ary" either, but I did it in an attempt to conform
to the stated guidelines.

^ permalink raw reply

* [PATCH] api-allocation-growing.txt: encourage better variable naming
From: Adam Spiers @ 2013-01-06 15:35 UTC (permalink / raw)
  To: git list
In-Reply-To: <20130106152716.GB2396@pacific.linksys.moosehall>

The documentation for the ALLOC_GROW API implicitly encouraged
developers to use "ary" as the variable name for the array which is
dynamically grown.  However "ary" is an unusual abbreviation hardly
used anywhere else in the source tree, and it is also better to name
variables based on their contents not on their type.

Signed-off-by: Adam Spiers <git@adamspiers.org>
---
 Documentation/technical/api-allocation-growing.txt | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/Documentation/technical/api-allocation-growing.txt b/Documentation/technical/api-allocation-growing.txt
index 43dbe09..3894815 100644
--- a/Documentation/technical/api-allocation-growing.txt
+++ b/Documentation/technical/api-allocation-growing.txt
@@ -5,7 +5,9 @@ Dynamically growing an array using realloc() is error prone and boring.
 
 Define your array with:
 
-* a pointer (`ary`) that points at the array, initialized to `NULL`;
+* a pointer (`array`) that points at the array, initialized to `NULL`
+  (although please name the variable based on its contents, not on its
+  type);
 
 * an integer variable (`alloc`) that keeps track of how big the current
   allocation is, initialized to `0`;
@@ -13,22 +15,22 @@ Define your array with:
 * another integer variable (`nr`) to keep track of how many elements the
   array currently has, initialized to `0`.
 
-Then before adding `n`th element to the array, call `ALLOC_GROW(ary, n,
+Then before adding `n`th element to the array, call `ALLOC_GROW(array, n,
 alloc)`.  This ensures that the array can hold at least `n` elements by
 calling `realloc(3)` and adjusting `alloc` variable.
 
 ------------
-sometype *ary;
+sometype *array;
 size_t nr;
 size_t alloc
 
 for (i = 0; i < nr; i++)
-	if (we like ary[i] already)
+	if (we like array[i] already)
 		return;
 
 /* we did not like any existing one, so add one */
-ALLOC_GROW(ary, nr + 1, alloc);
-ary[nr++] = value you like;
+ALLOC_GROW(array, nr + 1, alloc);
+array[nr++] = value you like;
 ------------
 
 You are responsible for updating the `nr` variable.
-- 
1.7.11.7.33.gb8feba5

^ permalink raw reply related

* Re: as/check-ignore (was Re: What's cooking in git.git (Jan 2013, #02; Thu, 3))
From: Adam Spiers @ 2013-01-06 16:17 UTC (permalink / raw)
  To: git
In-Reply-To: <7vobh4tzx3.fsf@alter.siamese.dyndns.org>

On Fri, Jan 04, 2013 at 01:13:12PM -0800, Junio C Hamano wrote:
> Adam Spiers <git@adamspiers.org> writes:
> > On Thu, Jan 3, 2013 at 7:17 PM, Junio C Hamano <gitster@pobox.com> wrote:
> >> * as/check-ignore (2012-12-28) 19 commits
> >>  - Add git-check-ignore sub-command
> >>  - setup.c: document get_pathspec()
> >>  - pathspec.c: extract new validate_path() for reuse
> >>  - pathspec.c: move reusable code from builtin/add.c
> >>  - add.c: remove unused argument from validate_pathspec()
> >>  - add.c: refactor treat_gitlinks()
> >>  - dir.c: provide clear_directory() for reclaiming dir_struct memory
> >>  - dir.c: keep track of where patterns came from
> >>  - dir.c: use a single struct exclude_list per source of excludes
> >>  - dir.c: rename free_excludes() to clear_exclude_list()
> >>  - dir.c: refactor is_path_excluded()
> >>  - dir.c: refactor is_excluded()
> >>  - dir.c: refactor is_excluded_from_list()
> >>  - dir.c: rename excluded() to is_excluded()
> >>  - dir.c: rename excluded_from_list() to is_excluded_from_list()
> >>  - dir.c: rename path_excluded() to is_path_excluded()
> >>  - dir.c: rename cryptic 'which' variable to more consistent name
> >>  - Improve documentation and comments regarding directory traversal API
> >>  - api-directory-listing.txt: update to match code
> >>
> >>  Rerolled.  The early parts looked mostly fine; we may want to split
> >>  this into two topics and have the early half graduate sooner.
> >
> > Sounds good to me.  As already mentioned, I have the v4 series ready
> > and it addresses all issues already voiced in v3, but I have postponed
> > submitting it as per your request.  Please let me know when and how to
> > proceed, thanks!
> 
> I was planning to add a new "as/dir-c-cleanup" topic that leads to
> f619881 (dir.c: rename free_excludes() to clear_exclude_list(),
> 2012-12-27), and leave the remainder in this series.  I think the
> earlier parts of this series up to that point should go 'next' now.

That sounds perfect; thanks.  I'll rebase v4 on top of this and submit.

^ permalink raw reply

* Re: Re: [PATCH] Remove the suggestion to use parsecvs, which is currently broken.
From: Heiko Voigt @ 2013-01-06 16:34 UTC (permalink / raw)
  To: Eric S. Raymond; +Cc: git
In-Reply-To: <20121229044200.GA16086@thyrsus.com>

Hi,

On Fri, Dec 28, 2012 at 11:42:00PM -0500, Eric S. Raymond wrote:
> Heiko Voigt <hvoigt@hvoigt.net>:
> > Maybe you could add that information to the parsecvs compile
> > instructions? I think just because it takes some effort to compile does
> > not justify to remove this useful pointer here. When I was converting a
> > legacy cvs repository this pointer would have helped me a lot.
> 
> I'm parsecvs's maintainer now.  It's not in good shape; there is at
> least one other known showstopper besides the build issue.  I would
> strongly prefer to direct peoples' attention away from it until I
> have time to fix it and cut a release.  This is not a distant 
> prospect - two or three weeks out, maybe.

So for this short amount of time you want to change gits documentation?
Is this hint causing you trouble? Are there many people asking for
support because of that?

Even if it takes some effort to get parsecvs running I would like to
keep the hint to a good and proven cvs importer.

> The priority that is between me and fixing parsecvs is getting (a)
> cvsps and git-cvsimport to a non-broken state, and (b) having a sound
> test suite in place so I *know* it's in a non-broken state. As previously
> discussed, I will then apply that test suite to parsecvs.
> 
> Heiko, you can speed up the process by (a) adapting your tests for
> the new cvsps test code,

I had a quick glance at your testsuite. After building cvsps with
	make
and cd'ing into test I got a lot of error messages some saying that
cvsps was not found when issuing
	make
there. It would be great if do not need to install cvsps into my path
just for running the testsuite. 

There is no README so I am not sure how the tests are supposed to be
build in general. Due to the lack of documentation its probably easier
for you Eric to port my tests.

The structure of my tests is quite simple:

	t/  - All the tests
	t/cvsroot - A cvs module per test
	t/t[0-9]{4}*/expect - The expected cvsps output

You can copy the cvs repository modules and convert the expected cvsps
output to whatever output you want to test against. It the found
changeset ordering that is interesting.

> and (b) merging the fix you wrote so cvsps
> would pass the t9603 test.  

The fix was never clean and AFAIR the reason behind that was that the
breakage in commit ordering is not easy to fix in cvsps. That and
because there are other working tools out there was the reason why I
stopped working on fixing cvsps.

Cheers Heiko

^ permalink raw reply

* Re: [PATCH] status: report ignored yet tracked directories
From: Antoine Pelisse @ 2013-01-06 16:40 UTC (permalink / raw)
  To: Jeff King; +Cc: Torsten Bögershausen, git
In-Reply-To: <20130105230303.GA5195@sigill.intra.peff.net>

On Sun, Jan 6, 2013 at 12:03 AM, Jeff King <peff@peff.net> wrote:
> On Sat, Jan 05, 2013 at 09:42:43PM +0100, Antoine Pelisse wrote:
>
>> Tracked directories (i.e. directories containing tracked files) that
>> are ignored must be reported as ignored if they contain untracked files.
>>
>> Currently, tracked files or directories can't be reported untracked or ignored.
>> Remove that constraint when searching ignored files.
>>
>> Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
>> ---
>
> I was expecting to see some explanation of the user-visible bug here. In
> other words, what does this fix, and why does the bug only happen when
> core.ignorecase is set.

I spent a couple of hours trying to understand that issue, and even if
I ended-up with pretty much the same points as you do below, I was not
confident enough to phrase it like you just did.

> Looking at your fix and remembering how the index hashing works, I think
> the answer is that:
>
>   1. This bug only affects directories, because they are the only thing
>      that can be simultaneously "ignored and untracked" and "tracked"
>      (i.e., they have entries of both, and we are using
>      DIR_SHOW_OTHER_DIRECTORIES).
>
>   2. When core.ignorecase is false, the index name hash contains only
>      the file entries, and cache_name_exists returns an exact match. So
>      it doesn't matter if we make an extra check when adding the
>      directory via dir_add_name; we know that it will not be there, and
>      the final check is a no-op.
>
>   3. When core.ignorecase is true, we also store directory entries in
>      the index name hash, and this extra check is harmful; the entry
>      does not really exist in the index, and we still need to add it.

Yes, because of this couple of lines I guess (name-hash.c, hash_index_entry()):

  if (ignore_case)
    hash_index_entry_directories(istate, ce);

> But that makes me wonder. In the ignorecase=false case, I claimed that
> the check in dir_add_name is a no-op for mixed tracked/ignored
> directories. But it is presumably not a no-op for other cases. Your
> patch only turns it off when DIR_SHOW_IGNORED is set. But is it possible
> for us to have DIR_SHOW_IGNORED set, _and_ to pass in a path that exists
> in the index as a regular file?

I don't think so, because of the optimization I added in my previous
patch, in treat_file():

  /*
   * Optimization:
   * Don't spend time on indexed files, they won't be
   * added to the list anyway
   */
  struct cache_entry *ce = index_name_exists(&the_index,
    path->buf, path->len, ignore_case);

It's no longer an optimization but a required step, I will update the comment.

> I think in the normal file case, we'd expect treat_path to just tell us
> that it is handled, and we would not ever call dir_add_name in the first
> place. But what if we have an index entry for a file, but the working
> tree now contains a directory?

The directory is treated as any other untracked directory (it never
matches indexed file because of the trailing /).

> I _think_ we still do not hit this code path in that instance, because
> we will end up in treat_directory, and we will end up checking
> directory_exists_in_index. And I cannot get it to misbehave in practice.
> So I think your fix is correct, but the exact how and why is a bit
> subtle.

Thanks a lot for the help, I will try to come up with a better commit
message now.

> -Peff

^ permalink raw reply

* [PATCH v4 00/11] new git check-ignore sub-command
From: Adam Spiers @ 2013-01-06 16:58 UTC (permalink / raw)
  To: git list
In-Reply-To: <20130106161758.GC2396@pacific.linksys.moosehall>

This is the v4 re-roll of the check-ignore series, which is based on
Junio's as/dir-c-cleanup topic branch f6198812 (dir.c: rename
free_excludes() to clear_exclude_list(), 2012-12-27).  As previously
discussed, the earlier parts of the v3 series seem to be complete and
are progressing to 'next'.

Since v3, I addressed the issue of newly public functions with
unacceptably vague or generic names via the following steps:

  - eliminated extraction from add.c to pathspec.c of two functions
    which did not need to be public (validate_pathspec() and
    treat_gitlinks())

  - edited the series history to create separate commits for
    extraction of reusable code from treat_gitlinks() and
    validate_pathspec() into more carefully named public functions

This should make reviewing easier.

I will summarise the changes in the revised patches since v3 in
between the "---" divider and the diffstat of each individual patch.

This series is also available via the check-ignore-v4 tag in:

    git://github.com/aspiers/git.git

Adam Spiers (11):
  dir.c: use a single struct exclude_list per source of excludes
  dir.c: keep track of where patterns came from
  dir.c: provide clear_directory() for reclaiming dir_struct memory
  dir.c: improve docs for match_pathspec() and match_pathspec_depth()
  add.c: remove unused argument from validate_pathspec()
  add.c: move pathspec matchers into new pathspec.c for reuse
  pathspec.c: rename newly public functions for clarity
  add.c: extract check_path_for_gitlink() from treat_gitlinks() for
    reuse
  add.c: extract new die_if_path_beyond_symlink() for reuse
  setup.c: document get_pathspec()
  add git-check-ignore sub-command

 .gitignore                                        |   1 +
 Documentation/git-check-ignore.txt                |  89 +++
 Documentation/gitignore.txt                       |   6 +-
 Documentation/technical/api-directory-listing.txt |  14 +-
 Makefile                                          |   3 +
 builtin.h                                         |   1 +
 builtin/add.c                                     |  78 +--
 builtin/check-ignore.c                            | 173 ++++++
 builtin/clean.c                                   |   3 +-
 builtin/ls-files.c                                |   9 +-
 command-list.txt                                  |   1 +
 contrib/completion/git-completion.bash            |   1 +
 dir.c                                             | 152 ++++--
 dir.h                                             |  62 ++-
 git.c                                             |   1 +
 pathspec.c                                        | 101 ++++
 pathspec.h                                        |   9 +
 setup.c                                           |  19 +
 t/t0008-ignores.sh                                | 632 ++++++++++++++++++++++
 unpack-trees.c                                    |   2 +-
 20 files changed, 1240 insertions(+), 117 deletions(-)
 create mode 100644 Documentation/git-check-ignore.txt
 create mode 100644 builtin/check-ignore.c
 create mode 100644 pathspec.c
 create mode 100644 pathspec.h
 create mode 100755 t/t0008-ignores.sh

-- 
1.7.11.7.33.gb8feba5

^ permalink raw reply

* [PATCH v4 04/11] dir.c: improve docs for match_pathspec() and match_pathspec_depth()
From: Adam Spiers @ 2013-01-06 16:58 UTC (permalink / raw)
  To: git list
In-Reply-To: <1357491493-11619-1-git-send-email-git@adamspiers.org>

Fix a grammatical issue in the description of these functions, and
make it more obvious how and why seen[] can be reused across multiple
invocations.

Signed-off-by: Adam Spiers <git@adamspiers.org>
---
 dir.c | 38 ++++++++++++++++++++++++++------------
 dir.h |  6 ++++++
 2 files changed, 32 insertions(+), 12 deletions(-)

diff --git a/dir.c b/dir.c
index 46f362e..547b83f 100644
--- a/dir.c
+++ b/dir.c
@@ -167,12 +167,19 @@ static int match_one(const char *match, const char *name, int namelen)
 }
 
 /*
- * Given a name and a list of pathspecs, see if the name matches
- * any of the pathspecs.  The caller is also interested in seeing
- * all pathspec matches some names it calls this function with
- * (otherwise the user could have mistyped the unmatched pathspec),
- * and a mark is left in seen[] array for pathspec element that
- * actually matched anything.
+ * Given a name and a list of pathspecs, returns the nature of the
+ * closest (i.e. most specific) match of the name to any of the
+ * pathspecs.
+ *
+ * The caller typically calls this multiple times with the same
+ * pathspec and seen[] array but with different name/namelen
+ * (e.g. entries from the index) and is interested in seeing if and
+ * how each pathspec matches all the names it calls this function
+ * with.  A mark is left in the seen[] array for each pathspec element
+ * indicating the closest type of match that element achieved, so if
+ * seen[n] remains zero after multiple invocations, that means the nth
+ * pathspec did not match any names, which could indicate that the
+ * user mistyped the nth pathspec.
  */
 int match_pathspec(const char **pathspec, const char *name, int namelen,
 		int prefix, char *seen)
@@ -239,12 +246,19 @@ static int match_pathspec_item(const struct pathspec_item *item, int prefix,
 }
 
 /*
- * Given a name and a list of pathspecs, see if the name matches
- * any of the pathspecs.  The caller is also interested in seeing
- * all pathspec matches some names it calls this function with
- * (otherwise the user could have mistyped the unmatched pathspec),
- * and a mark is left in seen[] array for pathspec element that
- * actually matched anything.
+ * Given a name and a list of pathspecs, returns the nature of the
+ * closest (i.e. most specific) match of the name to any of the
+ * pathspecs.
+ *
+ * The caller typically calls this multiple times with the same
+ * pathspec and seen[] array but with different name/namelen
+ * (e.g. entries from the index) and is interested in seeing if and
+ * how each pathspec matches all the names it calls this function
+ * with.  A mark is left in the seen[] array for each pathspec element
+ * indicating the closest type of match that element achieved, so if
+ * seen[n] remains zero after multiple invocations, that means the nth
+ * pathspec did not match any names, which could indicate that the
+ * user mistyped the nth pathspec.
  */
 int match_pathspec_depth(const struct pathspec *ps,
 			 const char *name, int namelen,
diff --git a/dir.h b/dir.h
index dd42a3a..136e838 100644
--- a/dir.h
+++ b/dir.h
@@ -116,6 +116,12 @@ struct dir_struct {
 	char basebuf[PATH_MAX];
 };
 
+/*
+ * The ordering of these constants is significant, with
+ * higher-numbered match types signifying "closer" (i.e. more
+ * specific) matches which will override lower-numbered match types
+ * when populating the seen[] array.
+ */
 #define MATCHED_RECURSIVELY 1
 #define MATCHED_FNMATCH 2
 #define MATCHED_EXACTLY 3
-- 
1.7.11.7.33.gb8feba5

^ permalink raw reply related

* [PATCH v4 02/11] dir.c: keep track of where patterns came from
From: Adam Spiers @ 2013-01-06 16:58 UTC (permalink / raw)
  To: git list
In-Reply-To: <1357491493-11619-1-git-send-email-git@adamspiers.org>

For exclude patterns read in from files, the filename is stored in the
exclude list, and the originating line number is stored in the
individual exclude (counting starting at 1).

For exclude patterns provided on the command line, a string describing
the source of the patterns is stored in the exclude list, and the
sequence number assigned to each exclude pattern is negative, with
counting starting at -1.  So for example the 2nd pattern provided via
--exclude would be numbered -2.  This allows any future consumers of
that data to easily distinguish between exclude patterns from files
vs. from the CLI.

Signed-off-by: Adam Spiers <git@adamspiers.org>
---
 builtin/clean.c    |  4 ++--
 builtin/ls-files.c |  5 +++--
 dir.c              | 26 ++++++++++++++++++++------
 dir.h              | 21 +++++++++++++++++++--
 4 files changed, 44 insertions(+), 12 deletions(-)

diff --git a/builtin/clean.c b/builtin/clean.c
index dd89737..b098288 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -97,10 +97,10 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 	if (!ignored)
 		setup_standard_excludes(&dir);
 
-	add_exclude_list(&dir, EXC_CMDL);
+	add_exclude_list(&dir, EXC_CMDL, "--exclude option");
 	for (i = 0; i < exclude_list.nr; i++)
 		add_exclude(exclude_list.items[i].string, "", 0,
-			    &dir.exclude_list_group[EXC_CMDL].el[0]);
+			    &dir.exclude_list_group[EXC_CMDL].el[0], -(i+1));
 
 	pathspec = get_pathspec(prefix, argv);
 
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index 0ca9d8e..fa9ccb8 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -35,6 +35,7 @@ static int error_unmatch;
 static char *ps_matched;
 static const char *with_tree;
 static int exc_given;
+static int exclude_args;
 
 static const char *tag_cached = "";
 static const char *tag_unmerged = "";
@@ -423,7 +424,7 @@ static int option_parse_exclude(const struct option *opt,
 	struct exclude_list_group *group = opt->value;
 
 	exc_given = 1;
-	add_exclude(arg, "", 0, &group->el[0]);
+	add_exclude(arg, "", 0, &group->el[0], --exclude_args);
 
 	return 0;
 }
@@ -524,7 +525,7 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 	if (read_cache() < 0)
 		die("index file corrupt");
 
-	add_exclude_list(&dir, EXC_CMDL);
+	add_exclude_list(&dir, EXC_CMDL, "--exclude option");
 	argc = parse_options(argc, argv, prefix, builtin_ls_files_options,
 			ls_files_usage, 0);
 	if (show_tag || show_valid_bit) {
diff --git a/dir.c b/dir.c
index 3a15cb9..d3f462b 100644
--- a/dir.c
+++ b/dir.c
@@ -349,7 +349,7 @@ void parse_exclude_pattern(const char **pattern,
 }
 
 void add_exclude(const char *string, const char *base,
-		 int baselen, struct exclude_list *el)
+		 int baselen, struct exclude_list *el, int srcpos)
 {
 	struct exclude *x;
 	int patternlen;
@@ -373,8 +373,10 @@ void add_exclude(const char *string, const char *base,
 	x->base = base;
 	x->baselen = baselen;
 	x->flags = flags;
+	x->srcpos = srcpos;
 	ALLOC_GROW(el->excludes, el->nr + 1, el->alloc);
 	el->excludes[el->nr++] = x;
+	x->el = el;
 }
 
 static void *read_skip_worktree_file_from_index(const char *path, size_t *size)
@@ -425,7 +427,7 @@ int add_excludes_from_file_to_list(const char *fname,
 				   int check_index)
 {
 	struct stat st;
-	int fd, i;
+	int fd, i, lineno = 1;
 	size_t size = 0;
 	char *buf, *entry;
 
@@ -467,15 +469,17 @@ int add_excludes_from_file_to_list(const char *fname,
 		if (buf[i] == '\n') {
 			if (entry != buf + i && entry[0] != '#') {
 				buf[i - (i && buf[i-1] == '\r')] = 0;
-				add_exclude(entry, base, baselen, el);
+				add_exclude(entry, base, baselen, el, lineno);
 			}
+			lineno++;
 			entry = buf + i + 1;
 		}
 	}
 	return 0;
 }
 
-struct exclude_list *add_exclude_list(struct dir_struct *dir, int group_type)
+struct exclude_list *add_exclude_list(struct dir_struct *dir,
+				      int group_type, const char *src)
 {
 	struct exclude_list *el;
 	struct exclude_list_group *group;
@@ -484,6 +488,7 @@ struct exclude_list *add_exclude_list(struct dir_struct *dir, int group_type)
 	ALLOC_GROW(group->el, group->nr + 1, group->alloc);
 	el = &group->el[group->nr++];
 	memset(el, 0, sizeof(*el));
+	el->src = src;
 	return el;
 }
 
@@ -493,7 +498,7 @@ struct exclude_list *add_exclude_list(struct dir_struct *dir, int group_type)
 void add_excludes_from_file(struct dir_struct *dir, const char *fname)
 {
 	struct exclude_list *el;
-	el = add_exclude_list(dir, EXC_FILE);
+	el = add_exclude_list(dir, EXC_FILE, fname);
 	if (add_excludes_from_file_to_list(fname, "", 0, el, 0) < 0)
 		die("cannot use %s as an exclude file", fname);
 }
@@ -524,6 +529,7 @@ static void prep_exclude(struct dir_struct *dir, const char *base, int baselen)
 			break;
 		el = &group->el[dir->exclude_stack->exclude_ix];
 		dir->exclude_stack = stk->prev;
+		free((char *)el->src); /* see strdup() below */
 		clear_exclude_list(el);
 		free(stk);
 		group->nr--;
@@ -550,7 +556,15 @@ static void prep_exclude(struct dir_struct *dir, const char *base, int baselen)
 		memcpy(dir->basebuf + current, base + current,
 		       stk->baselen - current);
 		strcpy(dir->basebuf + stk->baselen, dir->exclude_per_dir);
-		el = add_exclude_list(dir, EXC_DIRS);
+		/*
+		 * dir->basebuf gets reused by the traversal, but we
+		 * need fname to remain unchanged to ensure the src
+		 * member of each struct exclude correctly
+		 * back-references its source file.  Other invocations
+		 * of add_exclude_list provide stable strings, so we
+		 * strdup() and free() here in the caller.
+		 */
+		el = add_exclude_list(dir, EXC_DIRS, strdup(dir->basebuf));
 		stk->exclude_ix = group->nr - 1;
 		add_excludes_from_file_to_list(dir->basebuf,
 					       dir->basebuf, stk->baselen,
diff --git a/dir.h b/dir.h
index c4d88db..64c410e 100644
--- a/dir.h
+++ b/dir.h
@@ -25,16 +25,32 @@ struct dir_entry {
 struct exclude_list {
 	int nr;
 	int alloc;
+
 	/* remember pointer to exclude file contents so we can free() */
 	char *filebuf;
 
+	/* origin of list, e.g. path to filename, or descriptive string */
+	const char *src;
+
 	struct exclude {
+		/*
+		 * This allows callers of last_exclude_matching() etc.
+		 * to determine the origin of the matching pattern.
+		 */
+		struct exclude_list *el;
+
 		const char *pattern;
 		int patternlen;
 		int nowildcardlen;
 		const char *base;
 		int baselen;
 		int flags;
+
+		/*
+		 * Counting starts from 1 for line numbers in ignore files,
+		 * and from -1 decrementing for patterns from CLI args.
+		 */
+		int srcpos;
 	} **excludes;
 };
 
@@ -144,13 +160,14 @@ extern struct exclude *last_exclude_matching_path(struct path_exclude_check *, c
 extern int is_path_excluded(struct path_exclude_check *, const char *, int namelen, int *dtype);
 
 
-extern struct exclude_list *add_exclude_list(struct dir_struct *dir, int group_type);
+extern struct exclude_list *add_exclude_list(struct dir_struct *dir,
+					     int group_type, const char *src);
 extern int add_excludes_from_file_to_list(const char *fname, const char *base, int baselen,
 					  struct exclude_list *el, int check_index);
 extern void add_excludes_from_file(struct dir_struct *, const char *fname);
 extern void parse_exclude_pattern(const char **string, int *patternlen, int *flags, int *nowildcardlen);
 extern void add_exclude(const char *string, const char *base,
-			int baselen, struct exclude_list *el);
+			int baselen, struct exclude_list *el, int srcpos);
 extern void clear_exclude_list(struct exclude_list *el);
 extern int file_exists(const char *);
 
-- 
1.7.11.7.33.gb8feba5

^ permalink raw reply related

* [PATCH v4 03/11] dir.c: provide clear_directory() for reclaiming dir_struct memory
From: Adam Spiers @ 2013-01-06 16:58 UTC (permalink / raw)
  To: git list
In-Reply-To: <1357491493-11619-1-git-send-email-git@adamspiers.org>

By the end of a directory traversal, a dir_struct instance will
typically contains pointers to various data structures on the heap.
clear_directory() provides a convenient way to reclaim that memory.

Signed-off-by: Adam Spiers <git@adamspiers.org>
---
 Documentation/technical/api-directory-listing.txt |  2 ++
 dir.c                                             | 30 +++++++++++++++++++++++
 dir.h                                             |  1 +
 3 files changed, 33 insertions(+)

diff --git a/Documentation/technical/api-directory-listing.txt b/Documentation/technical/api-directory-listing.txt
index fa9c8ae..fbceb62 100644
--- a/Documentation/technical/api-directory-listing.txt
+++ b/Documentation/technical/api-directory-listing.txt
@@ -81,4 +81,6 @@ marked. If you to exclude files, make sure you have loaded index first.
 
 * Use `dir.entries[]`.
 
+* Call `free_directory()` when none of the contained elements are no longer in use.
+
 (JC)
diff --git a/dir.c b/dir.c
index d3f462b..46f362e 100644
--- a/dir.c
+++ b/dir.c
@@ -1557,3 +1557,33 @@ void free_pathspec(struct pathspec *pathspec)
 	free(pathspec->items);
 	pathspec->items = NULL;
 }
+
+/*
+ * Frees memory within dir which was allocated for exclude lists and
+ * the exclude_stack.  Does not free dir itself.
+ */
+void clear_directory(struct dir_struct *dir)
+{
+	int i, j;
+	struct exclude_list_group *group;
+	struct exclude_list *el;
+	struct exclude_stack *stk;
+
+	for (i = EXC_CMDL; i <= EXC_FILE; i++) {
+		group = &dir->exclude_list_group[i];
+		for (j = 0; j < group->nr; j++) {
+			el = &group->el[j];
+			if (i == EXC_DIRS)
+				free((char *)el->src);
+			clear_exclude_list(el);
+		}
+		free(group->el);
+	}
+
+	stk = dir->exclude_stack;
+	while (stk) {
+		struct exclude_stack *prev = stk->prev;
+		free(stk);
+		stk = prev;
+	}
+}
diff --git a/dir.h b/dir.h
index 64c410e..dd42a3a 100644
--- a/dir.h
+++ b/dir.h
@@ -169,6 +169,7 @@ extern void parse_exclude_pattern(const char **string, int *patternlen, int *fla
 extern void add_exclude(const char *string, const char *base,
 			int baselen, struct exclude_list *el, int srcpos);
 extern void clear_exclude_list(struct exclude_list *el);
+extern void clear_directory(struct dir_struct *dir);
 extern int file_exists(const char *);
 
 extern int is_inside_dir(const char *dir);
-- 
1.7.11.7.33.gb8feba5

^ permalink raw reply related

* [PATCH v4 10/11] setup.c: document get_pathspec()
From: Adam Spiers @ 2013-01-06 16:58 UTC (permalink / raw)
  To: git list
In-Reply-To: <1357491493-11619-1-git-send-email-git@adamspiers.org>

Since we have just created a new pathspec-handling library, now is a
good time to add some comments explaining get_pathspec().

Signed-off-by: Adam Spiers <git@adamspiers.org>
---
The deprecation warning is new since v3.

 setup.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/setup.c b/setup.c
index 7663a4c..9570147 100644
--- a/setup.c
+++ b/setup.c
@@ -249,6 +249,25 @@ static const char *prefix_pathspec(const char *prefix, int prefixlen, const char
 		return prefix_path(prefix, prefixlen, copyfrom);
 }
 
+/*
+ * N.B. get_pathspec() is deprecated in favor of the "struct pathspec"
+ * based interface - see pathspec_magic above.
+ *
+ * Arguments:
+ *  - prefix - a path relative to the root of the working tree
+ *  - pathspec - a list of paths underneath the prefix path
+ *
+ * Iterates over pathspec, prepending each path with prefix,
+ * and return the resulting list.
+ *
+ * If pathspec is empty, return a singleton list containing prefix.
+ *
+ * If pathspec and prefix are both empty, return an empty list.
+ *
+ * This is typically used by built-in commands such as add.c, in order
+ * to normalize argv arguments provided to the built-in into a list of
+ * paths to process, all relative to the root of the working tree.
+ */
 const char **get_pathspec(const char *prefix, const char **pathspec)
 {
 	const char *entry = *pathspec;
-- 
1.7.11.7.33.gb8feba5

^ permalink raw reply related

* [PATCH v4 05/11] add.c: remove unused argument from validate_pathspec()
From: Adam Spiers @ 2013-01-06 16:58 UTC (permalink / raw)
  To: git list
In-Reply-To: <1357491493-11619-1-git-send-email-git@adamspiers.org>

The 'argc' argument passed to validate_pathspec() was never used.

Signed-off-by: Adam Spiers <git@adamspiers.org>
---
 builtin/add.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index c689f37..1f62ba3 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -197,7 +197,7 @@ static void refresh(int verbose, const char **pathspec)
         free(seen);
 }
 
-static const char **validate_pathspec(int argc, const char **argv, const char *prefix)
+static const char **validate_pathspec(const char **argv, const char *prefix)
 {
 	const char **pathspec = get_pathspec(prefix, argv);
 
@@ -248,7 +248,7 @@ int interactive_add(int argc, const char **argv, const char *prefix, int patch)
 	const char **pathspec = NULL;
 
 	if (argc) {
-		pathspec = validate_pathspec(argc, argv, prefix);
+		pathspec = validate_pathspec(argv, prefix);
 		if (!pathspec)
 			return -1;
 	}
@@ -414,7 +414,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 		fprintf(stderr, _("Maybe you wanted to say 'git add .'?\n"));
 		return 0;
 	}
-	pathspec = validate_pathspec(argc, argv, prefix);
+	pathspec = validate_pathspec(argv, prefix);
 
 	if (read_cache() < 0)
 		die(_("index file corrupt"));
-- 
1.7.11.7.33.gb8feba5

^ permalink raw reply related

* [PATCH v4 01/11] dir.c: use a single struct exclude_list per source of excludes
From: Adam Spiers @ 2013-01-06 16:58 UTC (permalink / raw)
  To: git list
In-Reply-To: <1357491493-11619-1-git-send-email-git@adamspiers.org>

Previously each exclude_list could potentially contain patterns
from multiple sources.  For example dir->exclude_list[EXC_FILE]
would typically contain patterns from .git/info/exclude and
core.excludesfile, and dir->exclude_list[EXC_DIRS] could contain
patterns from multiple per-directory .gitignore files during
directory traversal (i.e. when dir->exclude_stack was more than
one item deep).

We split these composite exclude_lists up into three groups of
exclude_lists (EXC_CMDL / EXC_DIRS / EXC_FILE as before), so that each
exclude_list now contains patterns from a single source.  This will
allow us to cleanly track the origin of each pattern simply by adding
a src field to struct exclude_list, rather than to struct exclude,
which would make memory management of the source string tricky in the
EXC_DIRS case where its contents are dynamically generated.

Similarly, by moving the filebuf member from struct exclude_stack to
struct exclude_list, it allows us to track and subsequently free
memory buffers allocated during the parsing of all exclude files,
rather than only tracking buffers allocated for files in the EXC_DIRS
group.

Signed-off-by: Adam Spiers <git@adamspiers.org>
---
 Documentation/technical/api-directory-listing.txt | 12 +++--
 builtin/clean.c                                   |  3 +-
 builtin/ls-files.c                                |  8 +--
 dir.c                                             | 64 ++++++++++++++++-------
 dir.h                                             | 36 +++++++++----
 unpack-trees.c                                    |  2 +-
 6 files changed, 86 insertions(+), 39 deletions(-)

diff --git a/Documentation/technical/api-directory-listing.txt b/Documentation/technical/api-directory-listing.txt
index 944fc39..fa9c8ae 100644
--- a/Documentation/technical/api-directory-listing.txt
+++ b/Documentation/technical/api-directory-listing.txt
@@ -67,11 +67,13 @@ marked. If you to exclude files, make sure you have loaded index first.
 * Prepare `struct dir_struct dir` and clear it with `memset(&dir, 0,
   sizeof(dir))`.
 
-* Call `add_exclude()` to add single exclude pattern,
-  `add_excludes_from_file()` to add patterns from a file
-  (e.g. `.git/info/exclude`), and/or set `dir.exclude_per_dir`.  A
-  short-hand function `setup_standard_excludes()` can be used to set up
-  the standard set of exclude settings.
+* To add single exclude pattern, call `add_exclude_list()` and then
+  `add_exclude()`.
+
+* To add patterns from a file (e.g. `.git/info/exclude`), call
+  `add_excludes_from_file()` , and/or set `dir.exclude_per_dir`.  A
+  short-hand function `setup_standard_excludes()` can be used to set
+  up the standard set of exclude settings.
 
 * Set options described in the Data Structure section above.
 
diff --git a/builtin/clean.c b/builtin/clean.c
index 0c7b3d0..dd89737 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -97,9 +97,10 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 	if (!ignored)
 		setup_standard_excludes(&dir);
 
+	add_exclude_list(&dir, EXC_CMDL);
 	for (i = 0; i < exclude_list.nr; i++)
 		add_exclude(exclude_list.items[i].string, "", 0,
-			    &dir.exclude_list[EXC_CMDL]);
+			    &dir.exclude_list_group[EXC_CMDL].el[0]);
 
 	pathspec = get_pathspec(prefix, argv);
 
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index ef7f99a..0ca9d8e 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -420,10 +420,10 @@ static int option_parse_z(const struct option *opt,
 static int option_parse_exclude(const struct option *opt,
 				const char *arg, int unset)
 {
-	struct exclude_list *list = opt->value;
+	struct exclude_list_group *group = opt->value;
 
 	exc_given = 1;
-	add_exclude(arg, "", 0, list);
+	add_exclude(arg, "", 0, &group->el[0]);
 
 	return 0;
 }
@@ -488,7 +488,8 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 			"show unmerged files in the output"),
 		OPT_BOOLEAN(0, "resolve-undo", &show_resolve_undo,
 			    "show resolve-undo information"),
-		{ OPTION_CALLBACK, 'x', "exclude", &dir.exclude_list[EXC_CMDL], "pattern",
+		{ OPTION_CALLBACK, 'x', "exclude",
+			&dir.exclude_list_group[EXC_CMDL], "pattern",
 			"skip files matching pattern",
 			0, option_parse_exclude },
 		{ OPTION_CALLBACK, 'X', "exclude-from", &dir, "file",
@@ -523,6 +524,7 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 	if (read_cache() < 0)
 		die("index file corrupt");
 
+	add_exclude_list(&dir, EXC_CMDL);
 	argc = parse_options(argc, argv, prefix, builtin_ls_files_options,
 			ls_files_usage, 0);
 	if (show_tag || show_valid_bit) {
diff --git a/dir.c b/dir.c
index 41f141c..3a15cb9 100644
--- a/dir.c
+++ b/dir.c
@@ -411,15 +411,16 @@ void clear_exclude_list(struct exclude_list *el)
 	for (i = 0; i < el->nr; i++)
 		free(el->excludes[i]);
 	free(el->excludes);
+	free(el->filebuf);
 
 	el->nr = 0;
 	el->excludes = NULL;
+	el->filebuf = NULL;
 }
 
 int add_excludes_from_file_to_list(const char *fname,
 				   const char *base,
 				   int baselen,
-				   char **buf_p,
 				   struct exclude_list *el,
 				   int check_index)
 {
@@ -460,8 +461,7 @@ int add_excludes_from_file_to_list(const char *fname,
 		close(fd);
 	}
 
-	if (buf_p)
-		*buf_p = buf;
+	el->filebuf = buf;
 	entry = buf;
 	for (i = 0; i < size; i++) {
 		if (buf[i] == '\n') {
@@ -475,10 +475,26 @@ int add_excludes_from_file_to_list(const char *fname,
 	return 0;
 }
 
+struct exclude_list *add_exclude_list(struct dir_struct *dir, int group_type)
+{
+	struct exclude_list *el;
+	struct exclude_list_group *group;
+
+	group = &dir->exclude_list_group[group_type];
+	ALLOC_GROW(group->el, group->nr + 1, group->alloc);
+	el = &group->el[group->nr++];
+	memset(el, 0, sizeof(*el));
+	return el;
+}
+
+/*
+ * Used to set up core.excludesfile and .git/info/exclude lists.
+ */
 void add_excludes_from_file(struct dir_struct *dir, const char *fname)
 {
-	if (add_excludes_from_file_to_list(fname, "", 0, NULL,
-					   &dir->exclude_list[EXC_FILE], 0) < 0)
+	struct exclude_list *el;
+	el = add_exclude_list(dir, EXC_FILE);
+	if (add_excludes_from_file_to_list(fname, "", 0, el, 0) < 0)
 		die("cannot use %s as an exclude file", fname);
 }
 
@@ -488,6 +504,7 @@ void add_excludes_from_file(struct dir_struct *dir, const char *fname)
  */
 static void prep_exclude(struct dir_struct *dir, const char *base, int baselen)
 {
+	struct exclude_list_group *group;
 	struct exclude_list *el;
 	struct exclude_stack *stk = NULL;
 	int current;
@@ -496,17 +513,20 @@ static void prep_exclude(struct dir_struct *dir, const char *base, int baselen)
 	    (baselen + strlen(dir->exclude_per_dir) >= PATH_MAX))
 		return; /* too long a path -- ignore */
 
-	/* Pop the directories that are not the prefix of the path being checked. */
-	el = &dir->exclude_list[EXC_DIRS];
+	group = &dir->exclude_list_group[EXC_DIRS];
+
+	/* Pop the exclude lists from the EXCL_DIRS exclude_list_group
+	 * which originate from directories not in the prefix of the
+	 * path being checked. */
 	while ((stk = dir->exclude_stack) != NULL) {
 		if (stk->baselen <= baselen &&
 		    !strncmp(dir->basebuf, base, stk->baselen))
 			break;
+		el = &group->el[dir->exclude_stack->exclude_ix];
 		dir->exclude_stack = stk->prev;
-		while (stk->exclude_ix < el->nr)
-			free(el->excludes[--el->nr]);
-		free(stk->filebuf);
+		clear_exclude_list(el);
 		free(stk);
+		group->nr--;
 	}
 
 	/* Read from the parent directories and push them down. */
@@ -527,13 +547,14 @@ static void prep_exclude(struct dir_struct *dir, const char *base, int baselen)
 		}
 		stk->prev = dir->exclude_stack;
 		stk->baselen = cp - base;
-		stk->exclude_ix = el->nr;
 		memcpy(dir->basebuf + current, base + current,
 		       stk->baselen - current);
 		strcpy(dir->basebuf + stk->baselen, dir->exclude_per_dir);
+		el = add_exclude_list(dir, EXC_DIRS);
+		stk->exclude_ix = group->nr - 1;
 		add_excludes_from_file_to_list(dir->basebuf,
 					       dir->basebuf, stk->baselen,
-					       &stk->filebuf, el, 1);
+					       el, 1);
 		dir->exclude_stack = stk;
 		current = stk->baselen;
 	}
@@ -679,18 +700,23 @@ static struct exclude *last_exclude_matching(struct dir_struct *dir,
 					     int *dtype_p)
 {
 	int pathlen = strlen(pathname);
-	int st;
+	int i, j;
+	struct exclude_list_group *group;
 	struct exclude *exclude;
 	const char *basename = strrchr(pathname, '/');
 	basename = (basename) ? basename+1 : pathname;
 
 	prep_exclude(dir, pathname, basename-pathname);
-	for (st = EXC_CMDL; st <= EXC_FILE; st++) {
-		exclude = last_exclude_matching_from_list(
-			pathname, pathlen, basename, dtype_p,
-			&dir->exclude_list[st]);
-		if (exclude)
-			return exclude;
+
+	for (i = EXC_CMDL; i <= EXC_FILE; i++) {
+		group = &dir->exclude_list_group[i];
+		for (j = group->nr - 1; j >= 0; j--) {
+			exclude = last_exclude_matching_from_list(
+				pathname, pathlen, basename, dtype_p,
+				&group->el[j]);
+			if (exclude)
+				return exclude;
+		}
 	}
 	return NULL;
 }
diff --git a/dir.h b/dir.h
index 5664ba8..c4d88db 100644
--- a/dir.h
+++ b/dir.h
@@ -16,14 +16,18 @@ struct dir_entry {
 #define EXC_FLAG_NEGATIVE 16
 
 /*
- * Each .gitignore file will be parsed into patterns which are then
- * appended to the relevant exclude_list (either EXC_DIRS or
- * EXC_FILE).  exclude_lists are also used to represent the list of
- * --exclude values passed via CLI args (EXC_CMDL).
+ * Each excludes file will be parsed into a fresh exclude_list which
+ * is appended to the relevant exclude_list_group (either EXC_DIRS or
+ * EXC_FILE).  An exclude_list within the EXC_CMDL exclude_list_group
+ * can also be used to represent the list of --exclude values passed
+ * via CLI args.
  */
 struct exclude_list {
 	int nr;
 	int alloc;
+	/* remember pointer to exclude file contents so we can free() */
+	char *filebuf;
+
 	struct exclude {
 		const char *pattern;
 		int patternlen;
@@ -42,9 +46,13 @@ struct exclude_list {
  */
 struct exclude_stack {
 	struct exclude_stack *prev; /* the struct exclude_stack for the parent directory */
-	char *filebuf; /* remember pointer to per-directory exclude file contents so we can free() */
 	int baselen;
-	int exclude_ix;
+	int exclude_ix; /* index of exclude_list within EXC_DIRS exclude_list_group */
+};
+
+struct exclude_list_group {
+	int nr, alloc;
+	struct exclude_list *el;
 };
 
 struct dir_struct {
@@ -62,16 +70,23 @@ struct dir_struct {
 
 	/* Exclude info */
 	const char *exclude_per_dir;
-	struct exclude_list exclude_list[3];
+
 	/*
-	 * We maintain three exclude pattern lists:
+	 * We maintain three groups of exclude pattern lists:
+	 *
 	 * EXC_CMDL lists patterns explicitly given on the command line.
 	 * EXC_DIRS lists patterns obtained from per-directory ignore files.
-	 * EXC_FILE lists patterns from fallback ignore files.
+	 * EXC_FILE lists patterns from fallback ignore files, e.g.
+	 *   - .git/info/exclude
+	 *   - core.excludesfile
+	 *
+	 * Each group contains multiple exclude lists, a single list
+	 * per source.
 	 */
 #define EXC_CMDL 0
 #define EXC_DIRS 1
 #define EXC_FILE 2
+	struct exclude_list_group exclude_list_group[3];
 
 	/*
 	 * Temporary variables which are used during loading of the
@@ -129,8 +144,9 @@ extern struct exclude *last_exclude_matching_path(struct path_exclude_check *, c
 extern int is_path_excluded(struct path_exclude_check *, const char *, int namelen, int *dtype);
 
 
+extern struct exclude_list *add_exclude_list(struct dir_struct *dir, int group_type);
 extern int add_excludes_from_file_to_list(const char *fname, const char *base, int baselen,
-					  char **buf_p, struct exclude_list *el, int check_index);
+					  struct exclude_list *el, int check_index);
 extern void add_excludes_from_file(struct dir_struct *, const char *fname);
 extern void parse_exclude_pattern(const char **string, int *patternlen, int *flags, int *nowildcardlen);
 extern void add_exclude(const char *string, const char *base,
diff --git a/unpack-trees.c b/unpack-trees.c
index ad621d9..de8da46 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1019,7 +1019,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 	if (!core_apply_sparse_checkout || !o->update)
 		o->skip_sparse_checkout = 1;
 	if (!o->skip_sparse_checkout) {
-		if (add_excludes_from_file_to_list(git_path("info/sparse-checkout"), "", 0, NULL, &el, 0) < 0)
+		if (add_excludes_from_file_to_list(git_path("info/sparse-checkout"), "", 0, &el, 0) < 0)
 			o->skip_sparse_checkout = 1;
 		else
 			o->el = &el;
-- 
1.7.11.7.33.gb8feba5

^ permalink raw reply related

* [PATCH v4 09/11] add.c: extract new die_if_path_beyond_symlink() for reuse
From: Adam Spiers @ 2013-01-06 16:58 UTC (permalink / raw)
  To: git list
In-Reply-To: <1357491493-11619-1-git-send-email-git@adamspiers.org>

This will be reused by a new git check-ignore command.

Also document validate_pathspec().

Signed-off-by: Adam Spiers <git@adamspiers.org>
---
Unlike v3, this series doesn't make validate_pathspec() public.

 builtin/add.c | 10 ++++++----
 pathspec.c    | 12 ++++++++++++
 pathspec.h    |  1 +
 3 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index f95ded2..3716617 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -153,6 +153,11 @@ static void refresh(int verbose, const char **pathspec)
         free(seen);
 }
 
+/*
+ * Normalizes argv relative to prefix, via get_pathspec(), and then
+ * runs die_if_path_beyond_symlink() on each path in the normalized
+ * list.
+ */
 static const char **validate_pathspec(const char **argv, const char *prefix)
 {
 	const char **pathspec = get_pathspec(prefix, argv);
@@ -160,10 +165,7 @@ static const char **validate_pathspec(const char **argv, const char *prefix)
 	if (pathspec) {
 		const char **p;
 		for (p = pathspec; *p; p++) {
-			if (has_symlink_leading_path(*p, strlen(*p))) {
-				int len = prefix ? strlen(prefix) : 0;
-				die(_("'%s' is beyond a symbolic link"), *p + len);
-			}
+			die_if_path_beyond_symlink(*p, prefix);
 		}
 	}
 
diff --git a/pathspec.c b/pathspec.c
index 02d3344..284f397 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -87,3 +87,15 @@ const char *check_path_for_gitlink(const char *path)
 	}
 	return path;
 }
+
+/*
+ * Dies if the given path refers to a file inside a symlinked
+ * directory in the index.
+ */
+void die_if_path_beyond_symlink(const char *path, const char *prefix)
+{
+	if (has_symlink_leading_path(path, strlen(path))) {
+		int len = prefix ? strlen(prefix) : 0;
+		die(_("'%s' is beyond a symbolic link"), path + len);
+	}
+}
diff --git a/pathspec.h b/pathspec.h
index bf8eb96..db0184a 100644
--- a/pathspec.h
+++ b/pathspec.h
@@ -4,5 +4,6 @@
 extern char *find_pathspecs_matching_against_index(const char **pathspec);
 extern void add_pathspec_matches_against_index(const char **pathspec, char *seen, int specs);
 extern const char *check_path_for_gitlink(const char *path);
+extern void die_if_path_beyond_symlink(const char *path, const char *prefix);
 
 #endif /* PATHSPEC_H */
-- 
1.7.11.7.33.gb8feba5

^ permalink raw reply related

* [PATCH v4 11/11] add git-check-ignore sub-command
From: Adam Spiers @ 2013-01-06 16:58 UTC (permalink / raw)
  To: git list
In-Reply-To: <1357491493-11619-1-git-send-email-git@adamspiers.org>

This works in a similar manner to git-check-attr.

Thanks to Jeff King and Junio C Hamano for the idea:
http://thread.gmane.org/gmane.comp.version-control.git/108671/focus=108815

Signed-off-by: Adam Spiers <git@adamspiers.org>
---
Several minor improvements since v3:

  - rename char *dir to slash
  - fix some declaration-after-statement violations
  - correctly handle and test the case where --stdin is used
    but STDIN is empty
  - test that files in the index are not listed as ignored
  - test that the presence of a file in the working directory
    doesn't impact the result of running check-ignore on that file
  - stylistic tweaks
  - drop the -q option to grep in the test suite
  - fix potential brittleness with future tests which could call
    test_expect_success_multi with parameters containing double-quotes

 .gitignore                                        |   1 +
 Documentation/git-check-ignore.txt                |  89 +++
 Documentation/gitignore.txt                       |   6 +-
 Documentation/technical/api-directory-listing.txt |   2 +-
 Makefile                                          |   1 +
 builtin.h                                         |   1 +
 builtin/check-ignore.c                            | 173 ++++++
 command-list.txt                                  |   1 +
 contrib/completion/git-completion.bash            |   1 +
 git.c                                             |   1 +
 t/t0008-ignores.sh                                | 632 ++++++++++++++++++++++
 11 files changed, 905 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/git-check-ignore.txt
 create mode 100644 builtin/check-ignore.c
 create mode 100755 t/t0008-ignores.sh

diff --git a/.gitignore b/.gitignore
index f1acd3e..20ef4e8 100644
--- a/.gitignore
+++ b/.gitignore
@@ -19,6 +19,7 @@
 /git-bundle
 /git-cat-file
 /git-check-attr
+/git-check-ignore
 /git-check-ref-format
 /git-checkout
 /git-checkout-index
diff --git a/Documentation/git-check-ignore.txt b/Documentation/git-check-ignore.txt
new file mode 100644
index 0000000..854e4d0
--- /dev/null
+++ b/Documentation/git-check-ignore.txt
@@ -0,0 +1,89 @@
+git-check-ignore(1)
+===================
+
+NAME
+----
+git-check-ignore - Debug gitignore / exclude files
+
+
+SYNOPSIS
+--------
+[verse]
+'git check-ignore' [options] pathname...
+'git check-ignore' [options] --stdin < <list-of-paths>
+
+DESCRIPTION
+-----------
+
+For each pathname given via the command-line or from a file via
+`--stdin`, show the pattern from .gitignore (or other input files to
+the exclude mechanism) that decides if the pathname is excluded or
+included.  Later patterns within a file take precedence over earlier
+ones.
+
+OPTIONS
+-------
+-q, --quiet::
+	Don't output anything, just set exit status.  This is only
+	valid with a single pathname.
+
+-v, --verbose::
+	Also output details about the matching pattern (if any)
+	for each given pathname.
+
+--stdin::
+	Read file names from stdin instead of from the command-line.
+
+-z::
+	The output format is modified to be machine-parseable (see
+	below).  If `--stdin` is also given, input paths are separated
+	with a NUL character instead of a linefeed character.
+
+OUTPUT
+------
+
+By default, any of the given pathnames which match an ignore pattern
+will be output, one per line.  If no pattern matches a given path,
+nothing will be output for that path; this means that path will not be
+ignored.
+
+If `--verbose` is specified, the output is a series of lines of the form:
+
+<source> <COLON> <linenum> <COLON> <pattern> <HT> <pathname>
+
+<pathname> is the path of a file being queried, <pattern> is the
+matching pattern, <source> is the pattern's source file, and <linenum>
+is the line number of the pattern within that source.  If the pattern
+contained a `!` prefix or `/` suffix, it will be preserved in the
+output.  <source> will be an absolute path when referring to the file
+configured by `core.excludesfile`, or relative to the repository root
+when referring to `.git/info/exclude` or a per-directory exclude file.
+
+If `-z` is specified, the pathnames in the output are delimited by the
+null character; if `--verbose` is also specified then null characters
+are also used instead of colons and hard tabs:
+
+<source> <NULL> <linenum> <NULL> <pattern> <NULL> <pathname> <NULL>
+
+
+EXIT STATUS
+-----------
+
+0::
+	One or more of the provided paths is ignored.
+
+1::
+	None of the provided paths are ignored.
+
+128::
+	A fatal error was encountered.
+
+SEE ALSO
+--------
+linkgit:gitignore[5]
+linkgit:gitconfig[5]
+linkgit:git-ls-files[5]
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Documentation/gitignore.txt b/Documentation/gitignore.txt
index 2e7328b..f401b8c 100644
--- a/Documentation/gitignore.txt
+++ b/Documentation/gitignore.txt
@@ -153,8 +153,10 @@ The second .gitignore prevents git from ignoring
 
 SEE ALSO
 --------
-linkgit:git-rm[1], linkgit:git-update-index[1],
-linkgit:gitrepository-layout[5]
+linkgit:git-rm[1],
+linkgit:git-update-index[1],
+linkgit:gitrepository-layout[5],
+linkgit:git-check-ignore[1]
 
 GIT
 ---
diff --git a/Documentation/technical/api-directory-listing.txt b/Documentation/technical/api-directory-listing.txt
index fbceb62..9d3e352 100644
--- a/Documentation/technical/api-directory-listing.txt
+++ b/Documentation/technical/api-directory-listing.txt
@@ -81,6 +81,6 @@ marked. If you to exclude files, make sure you have loaded index first.
 
 * Use `dir.entries[]`.
 
-* Call `free_directory()` when none of the contained elements are no longer in use.
+* Call `clear_directory()` when none of the contained elements are no longer in use.
 
 (JC)
diff --git a/Makefile b/Makefile
index 48facad..8476fc8 100644
--- a/Makefile
+++ b/Makefile
@@ -822,6 +822,7 @@ BUILTIN_OBJS += builtin/branch.o
 BUILTIN_OBJS += builtin/bundle.o
 BUILTIN_OBJS += builtin/cat-file.o
 BUILTIN_OBJS += builtin/check-attr.o
+BUILTIN_OBJS += builtin/check-ignore.o
 BUILTIN_OBJS += builtin/check-ref-format.o
 BUILTIN_OBJS += builtin/checkout-index.o
 BUILTIN_OBJS += builtin/checkout.o
diff --git a/builtin.h b/builtin.h
index dffb34e..d57faf4 100644
--- a/builtin.h
+++ b/builtin.h
@@ -58,6 +58,7 @@ extern int cmd_cat_file(int argc, const char **argv, const char *prefix);
 extern int cmd_checkout(int argc, const char **argv, const char *prefix);
 extern int cmd_checkout_index(int argc, const char **argv, const char *prefix);
 extern int cmd_check_attr(int argc, const char **argv, const char *prefix);
+extern int cmd_check_ignore(int argc, const char **argv, const char *prefix);
 extern int cmd_check_ref_format(int argc, const char **argv, const char *prefix);
 extern int cmd_cherry(int argc, const char **argv, const char *prefix);
 extern int cmd_cherry_pick(int argc, const char **argv, const char *prefix);
diff --git a/builtin/check-ignore.c b/builtin/check-ignore.c
new file mode 100644
index 0000000..709535c
--- /dev/null
+++ b/builtin/check-ignore.c
@@ -0,0 +1,173 @@
+#include "builtin.h"
+#include "cache.h"
+#include "dir.h"
+#include "quote.h"
+#include "pathspec.h"
+#include "parse-options.h"
+
+static int quiet, verbose, stdin_paths;
+static const char * const check_ignore_usage[] = {
+"git check-ignore [options] pathname...",
+"git check-ignore [options] --stdin < <list-of-paths>",
+NULL
+};
+
+static int null_term_line;
+
+static const struct option check_ignore_options[] = {
+	OPT__QUIET(&quiet, N_("suppress progress reporting")),
+	OPT__VERBOSE(&verbose, N_("be verbose")),
+	OPT_GROUP(""),
+	OPT_BOOLEAN(0, "stdin", &stdin_paths,
+		    N_("read file names from stdin")),
+	OPT_BOOLEAN('z', NULL, &null_term_line,
+		    N_("input paths are terminated by a null character")),
+	OPT_END()
+};
+
+static void output_exclude(const char *path, struct exclude *exclude)
+{
+	char *bang  = exclude->flags & EXC_FLAG_NEGATIVE  ? "!" : "";
+	char *slash = exclude->flags & EXC_FLAG_MUSTBEDIR ? "/" : "";
+	if (!null_term_line) {
+		if (!verbose) {
+			write_name_quoted(path, stdout, '\n');
+		} else {
+			quote_c_style(exclude->el->src, NULL, stdout, 0);
+			printf(":%d:%s%s%s\t",
+			       exclude->srcpos,
+			       bang, exclude->pattern, slash);
+			quote_c_style(path, NULL, stdout, 0);
+			fputc('\n', stdout);
+		}
+	} else {
+		if (!verbose) {
+			printf("%s%c", path, '\0');
+		} else {
+			printf("%s%c%d%c%s%s%s%c%s%c",
+			       exclude->el->src, '\0',
+			       exclude->srcpos, '\0',
+			       bang, exclude->pattern, slash, '\0',
+			       path, '\0');
+		}
+	}
+}
+
+static int check_ignore(const char *prefix, const char **pathspec)
+{
+	struct dir_struct dir;
+	const char *path, *full_path;
+	char *seen;
+	int num_ignored = 0, dtype = DT_UNKNOWN, i;
+	struct path_exclude_check check;
+	struct exclude *exclude;
+
+	/* read_cache() is only necessary so we can watch out for submodules. */
+	if (read_cache() < 0)
+		die(_("index file corrupt"));
+
+	memset(&dir, 0, sizeof(dir));
+	dir.flags |= DIR_COLLECT_IGNORED;
+	setup_standard_excludes(&dir);
+
+	if (!pathspec || !*pathspec) {
+		if (!quiet)
+			fprintf(stderr, "no pathspec given.\n");
+		return 0;
+	}
+
+	path_exclude_check_init(&check, &dir);
+	/*
+	 * look for pathspecs matching entries in the index, since these
+	 * should not be ignored, in order to be consistent with
+	 * 'git status', 'git add' etc.
+	 */
+	seen = find_pathspecs_matching_against_index(pathspec);
+	for (i = 0; pathspec[i]; i++) {
+		path = pathspec[i];
+		full_path = prefix_path(prefix, prefix
+					? strlen(prefix) : 0, path);
+		full_path = check_path_for_gitlink(full_path);
+		die_if_path_beyond_symlink(full_path, prefix);
+		if (!seen[i] && path[0]) {
+			exclude = last_exclude_matching_path(&check, full_path,
+							     -1, &dtype);
+			if (exclude) {
+				if (!quiet)
+					output_exclude(path, exclude);
+				num_ignored++;
+			}
+		}
+	}
+	free(seen);
+	clear_directory(&dir);
+	path_exclude_check_clear(&check);
+
+	return num_ignored;
+}
+
+static int check_ignore_stdin_paths(const char *prefix)
+{
+	struct strbuf buf, nbuf;
+	char **pathspec = NULL;
+	size_t nr = 0, alloc = 0;
+	int line_termination = null_term_line ? 0 : '\n';
+	int num_ignored;
+
+	strbuf_init(&buf, 0);
+	strbuf_init(&nbuf, 0);
+	while (strbuf_getline(&buf, stdin, line_termination) != EOF) {
+		if (line_termination && buf.buf[0] == '"') {
+			strbuf_reset(&nbuf);
+			if (unquote_c_style(&nbuf, buf.buf, NULL))
+				die("line is badly quoted");
+			strbuf_swap(&buf, &nbuf);
+		}
+		ALLOC_GROW(pathspec, nr + 1, alloc);
+		pathspec[nr] = xcalloc(strlen(buf.buf) + 1, sizeof(*buf.buf));
+		strcpy(pathspec[nr++], buf.buf);
+	}
+	ALLOC_GROW(pathspec, nr + 1, alloc);
+	pathspec[nr] = NULL;
+	num_ignored = check_ignore(prefix, (const char **)pathspec);
+	maybe_flush_or_die(stdout, "attribute to stdout");
+	strbuf_release(&buf);
+	strbuf_release(&nbuf);
+	free(pathspec);
+	return num_ignored;
+}
+
+int cmd_check_ignore(int argc, const char **argv, const char *prefix)
+{
+	int num_ignored;
+
+	git_config(git_default_config, NULL);
+
+	argc = parse_options(argc, argv, prefix, check_ignore_options,
+			     check_ignore_usage, 0);
+
+	if (stdin_paths) {
+		if (argc > 0)
+			die(_("cannot specify pathnames with --stdin"));
+	} else {
+		if (null_term_line)
+			die(_("-z only makes sense with --stdin"));
+		if (argc == 0)
+			die(_("no path specified"));
+	}
+	if (quiet) {
+		if (argc > 1)
+			die(_("--quiet is only valid with a single pathname"));
+		if (verbose)
+			die(_("cannot have both --quiet and --verbose"));
+	}
+
+	if (stdin_paths) {
+		num_ignored = check_ignore_stdin_paths(prefix);
+	} else {
+		num_ignored = check_ignore(prefix, argv);
+		maybe_flush_or_die(stdout, "ignore to stdout");
+	}
+
+	return !num_ignored;
+}
diff --git a/command-list.txt b/command-list.txt
index 14ea67a..ef7f39c 100644
--- a/command-list.txt
+++ b/command-list.txt
@@ -12,6 +12,7 @@ git-branch                              mainporcelain common
 git-bundle                              mainporcelain
 git-cat-file                            plumbinginterrogators
 git-check-attr                          purehelpers
+git-check-ignore                        purehelpers
 git-checkout                            mainporcelain common
 git-checkout-index                      plumbingmanipulators
 git-check-ref-format                    purehelpers
diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash
index 2e1b5e1..1fb896b 100755
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -842,6 +842,7 @@ __git_list_porcelain_commands ()
 		archimport)       : import;;
 		cat-file)         : plumbing;;
 		check-attr)       : plumbing;;
+		check-ignore)     : plumbing;;
 		check-ref-format) : plumbing;;
 		checkout-index)   : plumbing;;
 		commit-tree)      : plumbing;;
diff --git a/git.c b/git.c
index d232de9..0b31e66 100644
--- a/git.c
+++ b/git.c
@@ -340,6 +340,7 @@ static void handle_internal_command(int argc, const char **argv)
 		{ "bundle", cmd_bundle, RUN_SETUP_GENTLY },
 		{ "cat-file", cmd_cat_file, RUN_SETUP },
 		{ "check-attr", cmd_check_attr, RUN_SETUP },
+		{ "check-ignore", cmd_check_ignore, RUN_SETUP | NEED_WORK_TREE },
 		{ "check-ref-format", cmd_check_ref_format },
 		{ "checkout", cmd_checkout, RUN_SETUP | NEED_WORK_TREE },
 		{ "checkout-index", cmd_checkout_index,
diff --git a/t/t0008-ignores.sh b/t/t0008-ignores.sh
new file mode 100755
index 0000000..9b0fcd6
--- /dev/null
+++ b/t/t0008-ignores.sh
@@ -0,0 +1,632 @@
+#!/bin/sh
+
+test_description=check-ignore
+
+. ./test-lib.sh
+
+init_vars () {
+	global_excludes="$(pwd)/global-excludes"
+}
+
+enable_global_excludes () {
+	init_vars &&
+	git config core.excludesfile "$global_excludes"
+}
+
+expect_in () {
+	dest="$HOME/expected-$1" text="$2"
+	if test -z "$text"
+	then
+		>"$dest" # avoid newline
+	else
+		echo "$text" >"$dest"
+	fi
+}
+
+expect () {
+	expect_in stdout "$1"
+}
+
+expect_from_stdin () {
+	cat >"$HOME/expected-stdout"
+}
+
+test_stderr () {
+	expected="$1"
+	expect_in stderr "$1" &&
+	test_cmp "$HOME/expected-stderr" "$HOME/stderr"
+}
+
+stderr_contains () {
+	regexp="$1"
+	if grep "$regexp" "$HOME/stderr"
+	then
+		return 0
+	else
+		echo "didn't find /$regexp/ in $HOME/stderr"
+		cat "$HOME/stderr"
+		return 1
+	fi
+}
+
+stderr_empty_on_success () {
+	expect_code="$1"
+	if test $expect_code = 0
+	then
+		test_stderr ""
+	else
+		# If we expect failure then stderr might or might not be empty
+		# due to --quiet - the caller can check its contents
+		return 0
+	fi
+}
+
+test_check_ignore () {
+	args="$1" expect_code="${2:-0}" global_args="$3"
+
+	init_vars &&
+	rm -f "$HOME/stdout" "$HOME/stderr" "$HOME/cmd" &&
+	echo git $global_args check-ignore $quiet_opt $verbose_opt $args \
+		>"$HOME/cmd" &&
+	test_expect_code "$expect_code" \
+		git $global_args check-ignore $quiet_opt $verbose_opt $args \
+		>"$HOME/stdout" 2>"$HOME/stderr" &&
+	test_cmp "$HOME/expected-stdout" "$HOME/stdout" &&
+	stderr_empty_on_success "$expect_code"
+}
+
+test_expect_success_multi () {
+	prereq=
+	if test $# -eq 4
+	then
+		prereq=$1
+		shift
+	fi
+	testname="$1" expect_verbose="$2" code="$3"
+
+	expect=$( echo "$expect_verbose" | sed -e 's/.*	//' )
+
+	test_expect_success $prereq "$testname" '
+		expect "$expect" &&
+		eval "$code"
+	'
+
+	for quiet_opt in '-q' '--quiet'
+	do
+		test_expect_success $prereq "$testname${quiet_opt:+ with $quiet_opt}" "
+			expect '' &&
+			$code
+		"
+	done
+	quiet_opt=
+
+	for verbose_opt in '-v' '--verbose'
+	do
+		test_expect_success $prereq "$testname${verbose_opt:+ with $verbose_opt}" "
+			expect '$expect_verbose' &&
+			$code
+		"
+	done
+	verbose_opt=
+}
+
+test_expect_success 'setup' '
+	init_vars &&
+	mkdir -p a/b/ignored-dir a/submodule b &&
+	if test_have_prereq SYMLINKS
+	then
+		ln -s b a/symlink
+	fi &&
+	(
+		cd a/submodule &&
+		git init &&
+		echo a >a &&
+		git add a &&
+		git commit -m"commit in submodule"
+	) &&
+	git add a/submodule &&
+	cat <<-\EOF >.gitignore &&
+		one
+		ignored-*
+	EOF
+	touch {,a/}{not-ignored,ignored-{and-untracked,but-in-index}} &&
+	git add -f {,a/}ignored-but-in-index
+	cat <<-\EOF >a/.gitignore &&
+		two*
+		*three
+	EOF
+	cat <<-\EOF >a/b/.gitignore &&
+		four
+		five
+		# this comment should affect the line numbers
+		six
+		ignored-dir/
+		# and so should this blank line:
+
+		!on*
+		!two
+	EOF
+	echo "seven" >a/b/ignored-dir/.gitignore &&
+	test -n "$HOME" &&
+	cat <<-\EOF >"$global_excludes" &&
+		globalone
+		!globaltwo
+		globalthree
+	EOF
+	cat <<-\EOF >>.git/info/exclude
+		per-repo
+	EOF
+'
+
+############################################################################
+#
+# test invalid inputs
+
+test_expect_success_multi 'empty command line' '' '
+	test_check_ignore "" 128 &&
+	stderr_contains "fatal: no path specified"
+'
+
+test_expect_success_multi '--stdin with empty STDIN' '' '
+	test_check_ignore "--stdin" 1 </dev/null &&
+	if test -n "$quiet_opt"; then
+		test_stderr ""
+	else
+		test_stderr "no pathspec given."
+	fi
+'
+
+test_expect_success '-q with multiple args' '
+	expect "" &&
+	test_check_ignore "-q one two" 128 &&
+	stderr_contains "fatal: --quiet is only valid with a single pathname"
+'
+
+test_expect_success '--quiet with multiple args' '
+	expect "" &&
+	test_check_ignore "--quiet one two" 128 &&
+	stderr_contains "fatal: --quiet is only valid with a single pathname"
+'
+
+for verbose_opt in '-v' '--verbose'
+do
+	for quiet_opt in '-q' '--quiet'
+	do
+		test_expect_success "$quiet_opt $verbose_opt" "
+			expect '' &&
+			test_check_ignore '$quiet_opt $verbose_opt foo' 128 &&
+			stderr_contains 'fatal: cannot have both --quiet and --verbose'
+		"
+	done
+done
+
+test_expect_success '--quiet with multiple args' '
+	expect "" &&
+	test_check_ignore "--quiet one two" 128 &&
+	stderr_contains "fatal: --quiet is only valid with a single pathname"
+'
+
+test_expect_success_multi 'erroneous use of --' '' '
+	test_check_ignore "--" 128 &&
+	stderr_contains "fatal: no path specified"
+'
+
+test_expect_success_multi '--stdin with superfluous arg' '' '
+	test_check_ignore "--stdin foo" 128 &&
+	stderr_contains "fatal: cannot specify pathnames with --stdin"
+'
+
+test_expect_success_multi '--stdin -z with superfluous arg' '' '
+	test_check_ignore "--stdin -z foo" 128 &&
+	stderr_contains "fatal: cannot specify pathnames with --stdin"
+'
+
+test_expect_success_multi '-z without --stdin' '' '
+	test_check_ignore "-z" 128 &&
+	stderr_contains "fatal: -z only makes sense with --stdin"
+'
+
+test_expect_success_multi '-z without --stdin and superfluous arg' '' '
+	test_check_ignore "-z foo" 128 &&
+	stderr_contains "fatal: -z only makes sense with --stdin"
+'
+
+test_expect_success_multi 'needs work tree' '' '
+	(
+		cd .git &&
+		test_check_ignore "foo" 128
+	) &&
+	stderr_contains "fatal: This operation must be run in a work tree"
+'
+
+############################################################################
+#
+# test standard ignores
+
+# First make sure that the presence of a file in the working tree
+# does not impact results, but that the presence of a file in the
+# index does.
+
+for subdir in '' 'a/'
+do
+	if test -z "$subdir"
+	then
+		where="at top-level"
+	else
+		where="in subdir $subdir"
+	fi
+
+	test_expect_success_multi "non-existent file $where not ignored" '' "
+		test_check_ignore '${subdir}non-existent' 1
+	"
+
+	test_expect_success_multi "non-existent file $where ignored" \
+		".gitignore:1:one	${subdir}one" "
+		test_check_ignore '${subdir}one'
+	"
+
+	test_expect_success_multi "existing untracked file $where not ignored" '' "
+		test_check_ignore '${subdir}not-ignored' 1
+	"
+
+	test_expect_success_multi "existing tracked file $where not ignored" '' "
+		test_check_ignore '${subdir}ignored-but-in-index' 1
+	"
+
+	test_expect_success_multi "existing untracked file $where ignored" \
+		".gitignore:2:ignored-*	${subdir}ignored-and-untracked" "
+		test_check_ignore '${subdir}ignored-and-untracked'
+	"
+done
+
+# Having established the above, from now on we mostly test against
+# files which do not exist in the working tree or index.
+
+test_expect_success 'sub-directory local ignore' '
+	expect "a/3-three" &&
+	test_check_ignore "a/3-three a/three-not-this-one"
+'
+
+test_expect_success 'sub-directory local ignore with --verbose'  '
+	expect "a/.gitignore:2:*three	a/3-three" &&
+	test_check_ignore "--verbose a/3-three a/three-not-this-one"
+'
+
+test_expect_success 'local ignore inside a sub-directory' '
+	expect "3-three" &&
+	(
+		cd a &&
+		test_check_ignore "3-three three-not-this-one"
+	)
+'
+test_expect_success 'local ignore inside a sub-directory with --verbose' '
+	expect "a/.gitignore:2:*three	3-three" &&
+	(
+		cd a &&
+		test_check_ignore "--verbose 3-three three-not-this-one"
+	)
+'
+
+test_expect_success_multi 'nested include' \
+	'a/b/.gitignore:8:!on*	a/b/one' '
+	test_check_ignore "a/b/one"
+'
+
+############################################################################
+#
+# test ignored sub-directories
+
+test_expect_success_multi 'ignored sub-directory' \
+	'a/b/.gitignore:5:ignored-dir/	a/b/ignored-dir' '
+	test_check_ignore "a/b/ignored-dir"
+'
+
+test_expect_success 'multiple files inside ignored sub-directory' '
+	expect_from_stdin <<-\EOF &&
+		a/b/ignored-dir/foo
+		a/b/ignored-dir/twoooo
+		a/b/ignored-dir/seven
+	EOF
+	test_check_ignore "a/b/ignored-dir/foo a/b/ignored-dir/twoooo a/b/ignored-dir/seven"
+'
+
+test_expect_success 'multiple files inside ignored sub-directory with -v' '
+	expect_from_stdin <<-\EOF &&
+		a/b/.gitignore:5:ignored-dir/	a/b/ignored-dir/foo
+		a/b/.gitignore:5:ignored-dir/	a/b/ignored-dir/twoooo
+		a/b/.gitignore:5:ignored-dir/	a/b/ignored-dir/seven
+	EOF
+	test_check_ignore "-v a/b/ignored-dir/foo a/b/ignored-dir/twoooo a/b/ignored-dir/seven"
+'
+
+test_expect_success 'cd to ignored sub-directory' '
+	expect_from_stdin <<-\EOF &&
+		foo
+		twoooo
+		../one
+		seven
+		../../one
+	EOF
+	(
+		cd a/b/ignored-dir &&
+		test_check_ignore "foo twoooo ../one seven ../../one"
+	)
+'
+
+test_expect_success 'cd to ignored sub-directory with -v' '
+	expect_from_stdin <<-\EOF &&
+		a/b/.gitignore:5:ignored-dir/	foo
+		a/b/.gitignore:5:ignored-dir/	twoooo
+		a/b/.gitignore:8:!on*	../one
+		a/b/.gitignore:5:ignored-dir/	seven
+		.gitignore:1:one	../../one
+	EOF
+	(
+		cd a/b/ignored-dir &&
+		test_check_ignore "-v foo twoooo ../one seven ../../one"
+	)
+'
+
+############################################################################
+#
+# test handling of symlinks
+
+test_expect_success_multi SYMLINKS 'symlink' '' '
+	test_check_ignore "a/symlink" 1
+'
+
+test_expect_success_multi SYMLINKS 'beyond a symlink' '' '
+	test_check_ignore "a/symlink/foo" 128 &&
+	test_stderr "fatal: '\''a/symlink/foo'\'' is beyond a symbolic link"
+'
+
+test_expect_success_multi SYMLINKS 'beyond a symlink from subdirectory' '' '
+	(
+		cd a &&
+		test_check_ignore "symlink/foo" 128
+	) &&
+	test_stderr "fatal: '\''symlink/foo'\'' is beyond a symbolic link"
+'
+
+############################################################################
+#
+# test handling of submodules
+
+test_expect_success_multi 'submodule' '' '
+	test_check_ignore "a/submodule/one" 128 &&
+	test_stderr "fatal: Path '\''a/submodule/one'\'' is in submodule '\''a/submodule'\''"
+'
+
+test_expect_success_multi 'submodule from subdirectory' '' '
+	(
+		cd a &&
+		test_check_ignore "submodule/one" 128
+	) &&
+	test_stderr "fatal: Path '\''a/submodule/one'\'' is in submodule '\''a/submodule'\''"
+'
+
+############################################################################
+#
+# test handling of global ignore files
+
+test_expect_success 'global ignore not yet enabled' '
+	expect_from_stdin <<-\EOF &&
+		.git/info/exclude:7:per-repo	per-repo
+		a/.gitignore:2:*three	a/globalthree
+		.git/info/exclude:7:per-repo	a/per-repo
+	EOF
+	test_check_ignore "-v globalone per-repo a/globalthree a/per-repo not-ignored a/globaltwo"
+'
+
+test_expect_success 'global ignore' '
+	enable_global_excludes &&
+	expect_from_stdin <<-\EOF &&
+		globalone
+		per-repo
+		globalthree
+		a/globalthree
+		a/per-repo
+		globaltwo
+	EOF
+	test_check_ignore "globalone per-repo globalthree a/globalthree a/per-repo not-ignored globaltwo"
+'
+
+test_expect_success 'global ignore with -v' '
+	enable_global_excludes &&
+	expect_from_stdin <<-EOF &&
+		$global_excludes:1:globalone	globalone
+		.git/info/exclude:7:per-repo	per-repo
+		$global_excludes:3:globalthree	globalthree
+		a/.gitignore:2:*three	a/globalthree
+		.git/info/exclude:7:per-repo	a/per-repo
+		$global_excludes:2:!globaltwo	globaltwo
+	EOF
+	test_check_ignore "-v globalone per-repo globalthree a/globalthree a/per-repo not-ignored globaltwo"
+'
+
+############################################################################
+#
+# test --stdin
+
+cat <<-\EOF >stdin
+	one
+	not-ignored
+	a/one
+	a/not-ignored
+	a/b/on
+	a/b/one
+	a/b/one one
+	"a/b/one two"
+	"a/b/one\"three"
+	a/b/not-ignored
+	a/b/two
+	a/b/twooo
+	globaltwo
+	a/globaltwo
+	a/b/globaltwo
+	b/globaltwo
+EOF
+cat <<-\EOF >expected-default
+	one
+	a/one
+	a/b/on
+	a/b/one
+	a/b/one one
+	a/b/one two
+	"a/b/one\"three"
+	a/b/two
+	a/b/twooo
+	globaltwo
+	a/globaltwo
+	a/b/globaltwo
+	b/globaltwo
+EOF
+cat <<-EOF >expected-verbose
+	.gitignore:1:one	one
+	.gitignore:1:one	a/one
+	a/b/.gitignore:8:!on*	a/b/on
+	a/b/.gitignore:8:!on*	a/b/one
+	a/b/.gitignore:8:!on*	a/b/one one
+	a/b/.gitignore:8:!on*	a/b/one two
+	a/b/.gitignore:8:!on*	"a/b/one\"three"
+	a/b/.gitignore:9:!two	a/b/two
+	a/.gitignore:1:two*	a/b/twooo
+	$global_excludes:2:!globaltwo	globaltwo
+	$global_excludes:2:!globaltwo	a/globaltwo
+	$global_excludes:2:!globaltwo	a/b/globaltwo
+	$global_excludes:2:!globaltwo	b/globaltwo
+EOF
+
+sed -e 's/^"//' -e 's/\\//' -e 's/"$//' stdin | \
+	tr "\n" "\0" >stdin0
+sed -e 's/^"//' -e 's/\\//' -e 's/"$//' expected-default | \
+	tr "\n" "\0" >expected-default0
+sed -e 's/	"/	/' -e 's/\\//' -e 's/"$//' expected-verbose | \
+	tr ":\t\n" "\0" >expected-verbose0
+
+test_expect_success '--stdin' '
+	expect_from_stdin <expected-default &&
+	test_check_ignore "--stdin" <stdin
+'
+
+test_expect_success '--stdin -q' '
+	expect "" &&
+	test_check_ignore "-q --stdin" <stdin
+'
+
+test_expect_success '--stdin -v' '
+	expect_from_stdin <expected-verbose &&
+	test_check_ignore "-v --stdin" <stdin
+'
+
+for opts in '--stdin -z' '-z --stdin'
+do
+	test_expect_success "$opts" "
+		expect_from_stdin <expected-default0 &&
+		test_check_ignore '$opts' <stdin0
+	"
+
+	test_expect_success "$opts -q" "
+		expect "" &&
+		test_check_ignore '-q $opts' <stdin0
+	"
+
+	test_expect_success "$opts -v" "
+		expect_from_stdin <expected-verbose0 &&
+		test_check_ignore '-v $opts' <stdin0
+	"
+done
+
+cat <<-\EOF >stdin
+	../one
+	../not-ignored
+	one
+	not-ignored
+	b/on
+	b/one
+	b/one one
+	"b/one two"
+	"b/one\"three"
+	b/two
+	b/not-ignored
+	b/twooo
+	../globaltwo
+	globaltwo
+	b/globaltwo
+	../b/globaltwo
+EOF
+cat <<-\EOF >expected-default
+	../one
+	one
+	b/on
+	b/one
+	b/one one
+	b/one two
+	"b/one\"three"
+	b/two
+	b/twooo
+	../globaltwo
+	globaltwo
+	b/globaltwo
+	../b/globaltwo
+EOF
+cat <<-EOF >expected-verbose
+	.gitignore:1:one	../one
+	.gitignore:1:one	one
+	a/b/.gitignore:8:!on*	b/on
+	a/b/.gitignore:8:!on*	b/one
+	a/b/.gitignore:8:!on*	b/one one
+	a/b/.gitignore:8:!on*	b/one two
+	a/b/.gitignore:8:!on*	"b/one\"three"
+	a/b/.gitignore:9:!two	b/two
+	a/.gitignore:1:two*	b/twooo
+	$global_excludes:2:!globaltwo	../globaltwo
+	$global_excludes:2:!globaltwo	globaltwo
+	$global_excludes:2:!globaltwo	b/globaltwo
+	$global_excludes:2:!globaltwo	../b/globaltwo
+EOF
+
+sed -e 's/^"//' -e 's/\\//' -e 's/"$//' stdin | \
+	tr "\n" "\0" >stdin0
+sed -e 's/^"//' -e 's/\\//' -e 's/"$//' expected-default | \
+	tr "\n" "\0" >expected-default0
+sed -e 's/	"/	/' -e 's/\\//' -e 's/"$//' expected-verbose | \
+	tr ":\t\n" "\0" >expected-verbose0
+
+test_expect_success '--stdin from subdirectory' '
+	expect_from_stdin <expected-default &&
+	(
+		cd a &&
+		test_check_ignore "--stdin" <../stdin
+	)
+'
+
+test_expect_success '--stdin from subdirectory with -v' '
+	expect_from_stdin <expected-verbose &&
+	(
+		cd a &&
+		test_check_ignore "--stdin -v" <../stdin
+	)
+'
+
+for opts in '--stdin -z' '-z --stdin'
+do
+	test_expect_success "$opts from subdirectory" '
+		expect_from_stdin <expected-default0 &&
+		(
+			cd a &&
+			test_check_ignore "'"$opts"'" <../stdin0
+		)
+	'
+
+	test_expect_success "$opts from subdirectory with -v" '
+		expect_from_stdin <expected-verbose0 &&
+		(
+			cd a &&
+			test_check_ignore "'"$opts"' -v" <../stdin0
+		)
+	'
+done
+
+
+test_done
-- 
1.7.11.7.33.gb8feba5

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox