Git development
* Re: clone --bare vs push
From: Ævar Arnfjörð Bjarmason @ 2011-01-03 18:41 UTC
  To: Levend Sayar; +Cc: git
In-Reply-To: <AANLkTi=RNDYrRbyEJXA_c30JEVr=SYUQ01cfA3FyWpLT@mail.gmail.com>

On Mon, Jan 3, 2011 at 19:24, Levend Sayar <levendsayar@gmail.com> wrote:

> 1) When I compare X/.git directory and y.git directory there are many
> objects missing in y.git. What is the reason ?

Maybe things were repacked.
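
A rough way to check the repacking explanation: "git count-objects -v"
reports loose objects ("count") separately from packed ones ("in-pack"),
so two repositories can hold the same history with very different files
under objects/. The throwaway repository, the commit identity and the
"field" helper below are made up for illustration.

```shell
#!/bin/sh
# Sketch: the same objects can live as loose files or inside a pack.
# Everything happens in a temporary directory; all names are illustrative.
set -e
GIT_AUTHOR_NAME=Demo; GIT_AUTHOR_EMAIL=demo@example.com
GIT_COMMITTER_NAME=Demo; GIT_COMMITTER_EMAIL=demo@example.com
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

tmp=$(mktemp -d)
cd "$tmp"
git init -q X
cd X
echo hello >file
git add file
git commit -q -m initial

# Hypothetical helper: print one field of "git count-objects -v".
field() { git count-objects -v | sed -n "s/^$1: //p"; }

loose_before=$(field count)   # blob, tree and commit are still loose files
git gc -q                     # repack; the loose copies are removed
loose_after=$(field count)
packed_after=$(field in-pack)

echo "loose before gc: $loose_before"
echo "loose after gc:  $loose_after, packed: $packed_after"
```

Comparing these numbers in X/.git and y.git says more than comparing
directory listings, since a pack holds many objects in one file.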

> 2) git clone --bare is too fast. My .git directory is nearly 1GB. How
> can it be copied that much fast ?

It will use hardlinks when using the same filesystem. See
--no-hardlinks in git-clone(1).
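
To see the difference, here is a minimal sketch that clones a throwaway
repository twice, once by default and once with --no-hardlinks, and
counts object files that have more than one hard link. The repository
names and the "count_linked" helper are assumptions for the demo; if the
filesystem does not support hard links, the first count may be 0 as well.

```shell
#!/bin/sh
# Sketch: compare a default local bare clone with a --no-hardlinks one.
set -e
GIT_AUTHOR_NAME=Demo; GIT_AUTHOR_EMAIL=demo@example.com
GIT_COMMITTER_NAME=Demo; GIT_COMMITTER_EMAIL=demo@example.com
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

tmp=$(mktemp -d)
cd "$tmp"
git init -q X
(cd X && echo hello >file && git add file && git commit -q -m initial)

git clone -q --bare X linked.git
git clone -q --bare --no-hardlinks X copied.git

# Hypothetical helper: count object files with more than one hard link.
count_linked() { find "$1/objects" -type f -links +1 | wc -l | tr -d ' '; }

linked=$(count_linked linked.git)
copied=$(count_linked copied.git)
echo "hardlinked object files, default clone:  $linked"
echo "hardlinked object files, --no-hardlinks: $copied"
```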


* clone --bare vs push
From: Levend Sayar @ 2011-01-03 18:24 UTC
  To: git
In-Reply-To: <AANLkTi=+cRqD_CDFyaYj8uWOxUA1+5Dgr_pv1guaaT40@mail.gmail.com>

Hi, all.

We cloned a repo from github on our local server. Say X for this. At
that repo, we did

git clone --bare X y.git

Now y.git is ready for other machines to clone.

To update our upstream repo X, we do

git pull

and then

git push --all

to update y.git.

Now questions:

1) When I compare X/.git directory and y.git directory there are many
objects missing in y.git. What is the reason ?

2) git clone --bare is too fast. My .git directory is nearly 1GB. How
can it be copied that much fast ?

3) Is this safer than git pull, git push?

rm -rf y.git
git pull
git clone --bare X y.git

Namely bare cloning each time when we update our main repo ?

TIA

_lvnd_
 (^_^)


* Re: [PATCH] Fix typos in the documentation
From: Drew Northup @ 2011-01-03 17:57 UTC
  To: Ralf Wildenhues; +Cc: git
In-Reply-To: <20110102055653.GI19818@gmx.de>


On Sun, 2011-01-02 at 06:56 +0100, Ralf Wildenhues wrote:

<snip for clarity> 
> diff --git a/Documentation/RelNotes/1.7.4.txt b/Documentation/RelNotes/1.7.4.txt
> index b736d39..5619641 100644
> --- a/Documentation/RelNotes/1.7.4.txt
> +++ b/Documentation/RelNotes/1.7.4.txt
> @@ -8,12 +8,11 @@ Updates since v1.7.3
>     docbook-xsl >= 1.73. If you have older versions, you can set
>     ASCIIDOC7 and ASCIIDOC_ROFF, respectively.
>  
> - * The option parsers of various commands that create new branch (or
> + * The option parsers of various commands that create new branches (or
>     rename existing ones to a new name) were too loose and users were
> -   allowed to call a branch with a name that begins with a dash by
> -   creative abuse of their command line options, which only lead to
> -   burn themselves.  The name of a branch cannot begin with a dash
> -   now.
> +   allowed to give a branch a name that begins with a dash by creative
> +   abuse of their command line options, which only led to burn themselves.
> +   The name of a branch cannot begin with a dash now.
>  
>   * System-wide fallback default attributes can be stored in
>     /etc/gitattributes; core.attributesfile configuration variable can
<snip for clarity>

Ralf,
Perhaps that should be:

- * The option parsers of various commands that create new branch (or
+ * The option parsers of various commands that create new branches (or
    rename existing ones to a new name) were too loose and users were
-   allowed to call a branch with a name that begins with a dash by
-   creative abuse of their command line options, which only lead to
-   burn themselves.  The name of a branch cannot begin with a dash
-   now.
+   allowed to give a branch a name that begins with a dash by creative
+   abuse of their command line options, which only led to burning 
+   themselves. The name of a branch cannot begin with a dash now.

(for consistency)?

-- 
-Drew Northup N1XIM
   AKA RvnPhnx on OPN
________________________________________________
"As opposed to vegetable or mineral error?"
-John Pescatore, SANS NewsBites Vol. 12 Num. 59


* Re: Commiting automatically (2)
From: Jakub Narebski @ 2011-01-03 17:34 UTC
  To: Maaartin-1; +Cc: git
In-Reply-To: <4D211AA4.4050108@seznam.cz>

On Mon, 3 Jan 2011, Maaartin-1 wrote:
> On 10-12-27 13:04, Jakub Narebski wrote:
>> On Wed, 22 Dec 2010, Maaartin-1 wrote:

>>> Moreover, git-show-ref --head shows all branches and tags, this can't be
>>> right, can it? According to your above explanation, getting HEAD using a
>>> pattern is impossible, so I'd say that's what is "--head" good for.
>>>
>>> Moreover, "git-show-ref --heads" shows less than "git-show-ref --head",
>>> despite the plural.
>> 
>> "git show-ref --head" is strange in that it doesn't play well
>> with '--heads' and '--tags' and '<pattern>'.
>> 
>> I think it is a bit of misdesign, but I don't know how it should be
>> fixed; current output of "git show-ref --head" has to be kept because
>> of backward compatibility - git-show-ref is plumbing.
> 
> I wonder what
> git show-ref --head
> really does. It seems to output everything, is this the expected (albeit
> strange) behavior? Maybe, I know now, s. below.
> 
> For sure, either the doc is completely wrong or the implementation. I
> hope I understand "Show the HEAD reference" correctly as showing the
> HEAD reference, don't I? So it must show a single reference (singular).
> Instead I get all tags and all heads. Could anybody either fix the doc
> or convince me that the many lines I'm seeing are a single one?

Well, it might be that *both* documentation and implementation are wrong.

> 
> Shouldn't there be an option *really* doing what --head is expected and
> documented to do? I mean something like
> git show-ref --head --yes-I-really-mean-the-head
> with the output consisting of a single line like
> 4ba2b422cf3cc229d894bb31c429c0c588de85c0 HEAD
> Maybe it could be called --head-only.
> 
> It could help a lot to add the word "additionally" to the doc like
> --head
> Additionally show the HEAD reference.

Well, actually it doesn't do that.  If '--head' is the *only* ref selector
(e.g. "git show-ref --head"), it shows the HEAD reference in addition to all
other refs (i.e. what "git show-ref" would show).  But it doesn't seem
to work in the described way when combined with any other ref selector;
neither "git show-ref --head --heads" nor "git show-ref --head master" works
as one would expect.

> 
>>>> I tripped over strange git-show-ref <pattern> semantic too.
>>>>
>>>> P.S. there is also git-for-each-ref.
>> 
>> I don't know why there is git-show-ref when we have git-for-each-ref
>> for scripting; I guess they were added nearly at the same time...
> 
> I guess, I can get the single line I wanted using
> git for-each-ref $(git symbolic-ref HEAD)
> right?

Well, both git-show-ref and git-for-each-ref are meant for checking or
viewing multiple refs at once.  If you are working with a single ref,
then git-rev-parse (e.g. "git rev-parse --symbolic HEAD" or
"git rev-parse --symbolic-full-name HEAD") or git-symbolic-ref would be
a better choice IMHO.
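
As a small sketch of the single-ref commands mentioned above, run in a
throwaway repository (the repository, file and identity are invented for
the demo; the branch name depends on the local default):

```shell
#!/bin/sh
# Sketch: resolve HEAD as a single ref instead of listing all refs.
set -e
GIT_AUTHOR_NAME=Demo; GIT_AUTHOR_EMAIL=demo@example.com
GIT_COMMITTER_NAME=Demo; GIT_COMMITTER_EMAIL=demo@example.com
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

tmp=$(mktemp -d)
cd "$tmp"
git init -q repo
cd repo
echo hello >file
git add file
git commit -q -m initial

branch=$(git symbolic-ref HEAD)                  # ref that HEAD points at
full=$(git rev-parse --symbolic-full-name HEAD)  # same answer via rev-parse
commit=$(git rev-parse HEAD)                     # object name HEAD resolves to

# The for-each-ref variant from the quoted mail, limited to that one ref.
line=$(git for-each-ref "$branch")

echo "HEAD points at $branch ($commit)"
echo "for-each-ref says: $line"
```

Note that "git symbolic-ref HEAD" fails on a detached HEAD, so a script
relying on it would need to handle that case.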

-- 
Jakub Narebski
Poland


* Applying .gitattributes text/eol changes
From: Marc Strapetz @ 2011-01-03 17:18 UTC
  To: git

I'm looking for an unobtrusive way to apply (committed) changes for
text/eol attributes to the working tree. For instance, after having
changed "*.txt eol=crlf" to "*.txt eol=lf", all *.txt files should be
converted from CRLF to LF endings. The only advice I found so far is to
remove .git/index and do a reset --hard. The disadvantage of this
approach is that every file will be touched:

- although the content does not change, timestamps will be changed. This
makes tools like IDEs assume that the file content has been changed.
(Even if the timestamps would be properly reset, the replacement of the
files would have triggered system file change notifications and I'd
expect various tools to still reload these files)

- there will be warnings for files which are locked by other processes
(at least on Windows). I'm usually seeing this for JAR files which are
not affected by eol-attribute changes at all.
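
For reference, the advice mentioned above (remove .git/index, then reset
--hard) can be sketched in a throwaway repository as follows. The file
names and identity are invented, and core.autocrlf is switched off only
to keep the demo independent of local configuration. It rewrites every
tracked file, which is exactly the drawback described.

```shell
#!/bin/sh
# Sketch: apply a committed eol attribute change to the working tree by
# dropping the index and resetting. Run only on a clean working tree.
set -e
GIT_AUTHOR_NAME=Demo; GIT_AUTHOR_EMAIL=demo@example.com
GIT_COMMITTER_NAME=Demo; GIT_COMMITTER_EMAIL=demo@example.com
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

tmp=$(mktemp -d)
cd "$tmp"
git init -q repo
cd repo
git config core.autocrlf false

echo '*.txt eol=crlf' >.gitattributes
printf 'one\r\ntwo\r\n' >a.txt           # CRLF in the working tree
git add .gitattributes a.txt
git commit -q -m 'text files use CRLF in the working tree'

echo '*.txt eol=lf' >.gitattributes
git commit -q -a -m 'switch text files to LF'

size_before=$(wc -c <a.txt | tr -d ' ')  # a.txt has not been rewritten yet

rm .git/index                            # forget what is checked out
git reset -q --hard                      # check every file out again

size_after=$(wc -c <a.txt | tr -d ' ')
echo "a.txt: $size_before bytes before, $size_after bytes after"
```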

One solution I could think of which might be helpful in other situations
as well would be to have an "--unobtrusive" option for reset which would
only replace a file if the content has actually been changed.

Marc.


* Re: fatal: ambiguous message
From: Eric Blake @ 2011-01-03 15:04 UTC
  To: Bruce Korb
  Cc: Jonathan Nieder, GNU Autoconf mailing list, GIT Development,
	bug-gnulib
In-Reply-To: <4D211555.1040502@gmail.com>

[redirecting to bug-gnulib as the owner of the git-version-gen script in
question; replies can drop other lists]

On 01/02/2011 05:16 PM, Bruce Korb wrote:
> Hi Jonathan,
> 
> On Sun, Jan 2, 2011 at 10:34 AM, Jonathan Nieder <jrnieder@gmail.com> wrote:
>> Were you been able to reproduce that outside the script?
> 
> No, I was blind to the invocation.  You found it.  I was looking
> without seeing.  Thank you.
> 
> Given that shells without functions can be considered sufficiently
> obsolete to not be a consideration, perhaps a better solution is
> to put the I-don't-care-about-error-messages code into a separate
> function with stderr redirected.  Doing that turned out messier
> than I had hoped....

Jonathan's patch:

> diff --git a/build-aux/git-version-gen b/build-aux/git-version-gen
> index 5617eb8..119d7aa 100755
> --- a/build-aux/git-version-gen
> +++ b/build-aux/git-version-gen
> @@ -119,7 +119,7 @@ then
>  	    # result is the same as if we were using the newer version
>  	    # of git describe.
>  	    vtag=`echo "$v" | sed 's/-.*//'`
> -	    numcommits=`git rev-list "$vtag"..HEAD | wc -l`
> +	    numcommits=`git rev-list "$vtag"..HEAD 2>/dev/null | wc -l`
>  	    v=`echo "$v" | sed "s/\(.*\)-\(.*\)/\1-$numcommits-\2/"`;
>  	    ;;
>      esac

makes sense to suppress the error message from leaking (whether or not
git can be improved to have the error message claim which program is
issuing the message); but there's still the nagging issue that because
git output is fed to a pipe, there's no way to check $? to see if git
failed, in order to properly react to that situation.

Bruce's patch mixes refactoring with bug fixing, making it a bit harder
to read, and introduces a bug in its own right:

> diff --git a/build-aux/git-version-gen b/build-aux/git-version-gen
> index c278f6a..8a238b0 100755
> --- a/build-aux/git-version-gen
> +++ b/build-aux/git-version-gen
> @@ -1,6 +1,6 @@
>  #!/bin/sh
>  # Print a version string.
> -scriptversion=2010-10-13.20; # UTC
> +scriptversion=2011-01-03.00; # UTC
>  
>  # Copyright (C) 2007-2011 Free Software Foundation, Inc.
>  #
> @@ -78,76 +78,96 @@ tag_sed_script="${2:-s/x/x/}"
>  nl='
>  '
>  
> -# Avoid meddling by environment variable of the same name.
> -v=
> +get_ver()
> +{
> +    local PS4='>gv> '

Portable scripts CANNOT use local (since POSIX does not require it), and
setting PS4 is not commonly done in portable scripting.
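
One portable way to keep a function's variables private without "local"
is to make the function body a subshell. A minimal sketch follows; the
variable name and values are placeholders, not taken from either patch.

```shell
#!/bin/sh
# Sketch: a function whose body is a subshell, so assignments stay private.
v=outer

get_ver() (
    # Everything here runs in a subshell; "v" in the caller is untouched.
    v=inner
    echo "$v"
)

get_ver                        # prints "inner"
echo "caller still has: $v"    # prints "caller still has: outer"
```

The price is that such a function can only hand results back through
its output or exit status, not by setting variables in the caller.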

I'll probably end up writing yet a third approach, which collects git
rev-list output into a temporary variable in order to correctly detect
failures, without refactoring into a helper function.
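
The idea of collecting the output in a variable first could look roughly
like the sketch below. It is not the eventual gnulib change: "list_revs"
is a hypothetical stand-in for the git rev-list call so that the snippet
runs anywhere, and $(...) is used here where the script itself uses
backticks.

```shell
#!/bin/sh
# Sketch: detect failure of a command whose output is then counted.

# Hypothetical stand-in for: git rev-list "$vtag"..HEAD
list_revs() { printf 'c3\nc2\nc1\n'; }

# In a pipeline only the last command's status is reported, so a failing
# producer goes unnoticed:
false | wc -l >/dev/null
pipeline_status=$?

# An assignment from a command substitution keeps that command's status.
if revs=$(list_revs 2>/dev/null); then
    if [ -z "$revs" ]; then
        numcommits=0
    else
        numcommits=$(printf '%s\n' "$revs" | wc -l | tr -d ' ')
    fi
else
    numcommits=
    echo "listing revisions failed" >&2
fi

echo "pipeline status: $pipeline_status, numcommits: $numcommits"
```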

-- 
Eric Blake   eblake@redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org



* Stashing subset of changed file plus a new file
From: Zivkov, Sasa @ 2011-01-03 14:05 UTC
  To: git@vger.kernel.org

Let's suppose I changed a file e.txt and introduced two changes to it: C1 and
C2. Additionally, I have a new file n.txt in my working tree. My goal is to
stash the change C1 plus the complete file n.txt while keeping the change C2 in
the working tree.

For Git experts who don't want to read the complete post: what is the best
(most intuitive) way to achieve that?


I don't know if I somehow missed a trivial and obvious solution but here is
what I went through:

1) First, I tried:
        > git stash --patch
and answered 'y' for C1 and 'n' for C2. For the e.txt this worked as expected
but git stash didn't ask me about the (changes of the) file n.txt. n.txt
remained in the working tree.  OK, reading through this list I found out that
git stash never looks at the non-tracked files.

2) Second trial:
        > git add n.txt
        > git stash --patch
and again answered the same for C1 and C2 as in 1). However, this again didn't
ask me about the n.txt and n.txt remained in the working tree.

3) Thinking how to make git stash --patch not ignore the n.txt. Obviously, it
looks only at the hunks produced by the diff between the working tree and the
index... let me add the n.txt to the index but not its content (thus producing
the desired diff):
        > git add -N n.txt
        > git stash --patch
        n.txt: not added yet
        fatal: git-write-tree: error building trees
        Cannot save the current index state

4) Similar to 3), trying to make a diff between the index and working tree
for n.txt:
        > git add n.txt
        > rm n.txt
        > git stash --patch
this went through but again didn't ask me for n.txt and n.txt wasn't part of
the stash commit:
        > git stash show
        ... no n.txt ...


5) Giving up git stash --patch, using git add --patch:
        > git add --patch
        ... skipped C1, added C2 ...
        > git add n.txt
        > git stash --keep-index
finally, produced the stash that includes the creation of the n.txt! However,
the n.txt is still in both the working tree and the index and it has to be
removed:
        > git rm -f n.txt
using the -f option in order to remove it both from the index and the working
tree.

Approach 5) is a solution but it has its drawbacks:
- one has to remove the new files after git stash
- when using git add --patch one has to select the hunks that have to stay in
  the working tree, which is exactly the opposite of git stash --patch, where
  one has to select the hunks to be stashed.
- unintuitive:
        new files are added to the index in order to be stashed
        hunks of diff of an existing file are added to the index in order to stay
        in the working tree

Questions:
Is there any solution better than 5) ?
Is git stash --patch at all able to stash new files?


Sasa Zivkov


* [PATCH 3/3] fast-import: add 'ls' command
From: Jonathan Nieder @ 2011-01-03  8:37 UTC
  To: David Barr
  Cc: Git Mailing List, Ramkumar Ramachandra, Sverre Rabbelier,
	Shawn O. Pearce, Tomas Carnecky
In-Reply-To: <20110103080130.GA8842@burratino>

From: David Barr <david.barr@cordelta.com>
Date: Thu, 2 Dec 2010 21:40:20 +1100

The vcs-svn library currently maintains an in-core index of all paths
in all revisions. Introducing an `ls` command to fast-import would
allow this responsibility to be delegated; and reading this
information from the target repository instead of an in-core data
structure would result in support for resuming an import partway
through (i.e., incremental imports) for free.

There are two forms of the 'ls' command: the two-argument form prints
the entry at <path> for the tree underlying the tree, commit, or tag
named by <dataref>:

	'ls' SP <dataref> SP <path> LF

The one-argument form prints the entry at <path> in fast-import's
active commit.

	'ls' SP <path> LF

Output uses ls-tree format.

Dirty hack: missing paths are assumed to represent the empty
subtree and are printed as

 040000 tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904	path/to/nowhere

to avoid confusing frontends that inserted such a path before.  But
frontends should also be prepared to accept

 missing path/to/nowhere

from backends that (unlike git) distinguish between empty subtrees and
nonentities.
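
As a minimal sketch of the two-argument form, assuming a git whose
fast-import has the 'ls' command described here: the entry for a path in
an existing commit is requested on stdin and the answer is read from
stdout, since no --cat-blob-fd is given. The repository, path and
identity are invented for the demo.

```shell
#!/bin/sh
# Sketch: ask fast-import for one directory entry and compare the answer
# with what ls-tree prints for the same path.
set -e
GIT_AUTHOR_NAME=Demo; GIT_AUTHOR_EMAIL=demo@example.com
GIT_COMMITTER_NAME=Demo; GIT_COMMITTER_EMAIL=demo@example.com
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

tmp=$(mktemp -d)
cd "$tmp"
git init -q repo
cd repo
mkdir dir
echo hello >dir/file
git add dir/file
git commit -q -m initial
commit=$(git rev-parse HEAD)

# 'ls' SP <dataref> SP <path> LF
answer=$(printf 'ls %s dir/file\n' "$commit" | git fast-import --quiet)

echo "fast-import: $answer"
echo "ls-tree:     $(git ls-tree "$commit" -- dir/file)"
```

The one-argument form only makes sense in the middle of a commit, so a
frontend using it needs a channel back from fast-import, as the fifo
based tests in the patch do.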

Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
The empty tree handling is an evil hack.  One of the tests illustrates
the kind of operation this is meant to support.  It would be easy to
convince me that some other evil hack is better.

This doesn't have tests for the basic functionality.  Maybe there
should be a new t9302-fast-import-bidi.sh so there is less to read to
get started?

No new "feature" for this.  Frontends can easily make a feature test
for themselves if they need it. ;-)  And I still have plans for
"feature command ls" et al, as part of a series including Tomas's
simplified command dispatch.

Only compile tested.  (Something similar to this is very well tested
but that is not enough to prevent accidents.)

Changes from v1:
 - new documentation and demo (tests)
 - refactored peel-to-tree routines
 - mode is always 6 digits
 - path output uses quoting (especially important for filenames
   with \n [though that wouldn't come up in the svn-fe case])
 - persistent buffers to avoid allocation overhead
 - the empty tree hackery
 - mode is based on type, not based on extracting the object itself
 - path after <dataref> does not have to be quoted
 - no-<dataref> form is 'ls "<path>"' instead of 'ls index "<path>"'

Thanks for the original patch and a lot of help improving it go to
David.

'night,
Jonathan

 Documentation/git-fast-import.txt |   49 +++++++++++-
 fast-import.c                     |  162 ++++++++++++++++++++++++++++++++++++-
 t/t9300-fast-import.sh            |   91 +++++++++++++++++++++
 3 files changed, 299 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-fast-import.txt b/Documentation/git-fast-import.txt
index f56dfca..3957f70 100644
--- a/Documentation/git-fast-import.txt
+++ b/Documentation/git-fast-import.txt
@@ -192,7 +192,8 @@ especially when a higher level language such as Perl, Python or
 Ruby is being used.
 
 fast-import is very strict about its input.  Where we say SP below we mean
-*exactly* one space.  Likewise LF means one (and only one) linefeed.
+*exactly* one space.  Likewise LF means one (and only one) linefeed
+and HT one (and only one) horizontal tab.
 Supplying additional whitespace characters will cause unexpected
 results, such as branch names or file names with leading or trailing
 spaces in their name, or early termination of fast-import when it encounters
@@ -330,6 +331,11 @@ and control the current import process.  More detailed discussion
 	format to the file descriptor set with `--cat-blob-fd` or
 	`stdout` if unspecified.
 
+`ls`::
+	Causes fast-import to print a directory entry in 'ls-tree'
+	format to the file descriptor set with `--cat-blob-fd` or
+	`stdout` if unspecified.
+
 `feature`::
 	Require that fast-import supports the specified feature, or
 	abort if it does not.
@@ -916,6 +922,47 @@ This command can be used anywhere in the stream that comments are
 accepted.  In particular, the `cat-blob` command can be used in the
 middle of a commit but not in the middle of a `data` command.
 
+`ls`
+~~~~
+Prints a directory entry to a file descriptor previously arranged with
+the `--cat-blob-fd` argument.  In the current implementation, if that
+entry represents a subdirectory in the current commit, it will be
+stored in the object database, but it is not advisable to rely on this
+detail since it may change.
+
+....
+	'ls' (SP <dataref>)? SP <path> LF
+....
+
+The `<dataref>` can be either a mark reference (`:<idnum>`) or a full
+40-byte SHA-1 of a Git tag, commit, or tree object, preexisting or
+waiting to be written.  The directory entry printed is that named by
+the path, relative to the top level of that tree.
+
+The `ls` command can be used anywhere in the stream that comments are
+accepted, including the middle of a commit.
+
+In the middle of a `commit`, the `<dataref>` part of the command can
+be omitted, in which case the path names a directory entry within
+fast-import's active commit.  The path must be quoted in this case.
+
+Output uses the same format as `git ls-tree <tree> -- <path>`:
+
+====
+	<mode> SP ('blob' | 'tree') SP <dataref> HT <path> LF
+====
+
+Since git repositories do not distinguish between missing paths and
+empty subtrees, if a path is not found it will be reported as an
+empty tree.  Backends for version control systems that do have a
+notion of empty trees may write
+
+====
+	missing SP <path> LF
+====
+
+for paths that do not correspond to a blob or subtree.
+
 `feature`
 ~~~~~~~~~
 Require that fast-import supports the specified feature, or abort if
diff --git a/fast-import.c b/fast-import.c
index 385d12d..21cb109 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -24,10 +24,12 @@ Format of STDIN stream:
     commit_msg
     ('from' sp committish lf)?
     ('merge' sp committish lf)*
-    file_change*
+    (file_change | ls)*
     lf?;
   commit_msg ::= data;
 
+  ls ::= 'ls' sp '"' quoted(path) '"' lf;
+
   file_change ::= file_clr
     | file_del
     | file_rnm
@@ -132,7 +134,7 @@ Format of STDIN stream:
   ts    ::= # time since the epoch in seconds, ascii base10 notation;
   tz    ::= # GIT style timezone;
 
-     # note: comments and cat requests may appear anywhere
+     # note: comments, ls and cat requests may appear anywhere
      # in the input, except within a data command.  Any form
      # of the data command always escapes the related input
      # from comment processing.
@@ -141,7 +143,9 @@ Format of STDIN stream:
      # must be the first character on that line (an lf
      # preceded it).
      #
+
   cat_blob ::= 'cat-blob' sp (hexsha1 | idnum) lf;
+  ls_tree  ::= 'ls' sp (hexsha1 | idnum) sp path_str lf;
 
   comment ::= '#' not_lf* lf;
   not_lf  ::= # Any byte that is not ASCII newline (LF);
@@ -373,6 +377,7 @@ static int cat_blob_fd = STDOUT_FILENO;
 
 static void parse_argv(void);
 static void parse_cat_blob(void);
+static void parse_ls(struct branch *b);
 
 static void write_branch_report(FILE *rpt, struct branch *b)
 {
@@ -2613,6 +2618,8 @@ static void parse_new_commit(void)
 			note_change_n(b, prev_fanout);
 		else if (!strcmp("deleteall", command_buf.buf))
 			file_change_deleteall(b);
+		else if (!prefixcmp(command_buf.buf, "ls "))
+			parse_ls(b);
 		else {
 			unread_command_buf = 1;
 			break;
@@ -2836,6 +2843,155 @@ static void parse_cat_blob(void)
 	cat_blob(oe, sha1);
 }
 
+static struct object_entry *dereference(struct object_entry *oe,
+					unsigned char sha1[20])
+{
+	unsigned long size;
+	void *buf = NULL;
+	if (!oe) {
+		enum object_type type = sha1_object_info(sha1, NULL);
+		if (type < 0)
+			die("object not found: %s", sha1_to_hex(sha1));
+		/* cache it! */
+		oe = insert_object(sha1);
+		oe->type = type;
+		oe->pack_id = MAX_PACK_ID;
+		oe->idx.offset = 1;
+	}
+	switch (oe->type) {
+	case OBJ_TREE:	/* easy case. */
+		return oe;
+	case OBJ_COMMIT:
+	case OBJ_TAG:
+		break;
+	default:
+		die("Not a treeish: %s", command_buf.buf);
+	}
+
+	if (oe->pack_id != MAX_PACK_ID) {	/* in a pack being written */
+		buf = gfi_unpack_entry(oe, &size);
+	} else {
+		enum object_type unused;
+		buf = read_sha1_file(sha1, &unused, &size);
+	}
+	if (!buf)
+		die("Can't load object %s", sha1_to_hex(sha1));
+
+	/* Peel one layer. */
+	switch (oe->type) {
+	case OBJ_TAG:
+		if (size < 40 + strlen("object ") ||
+		    get_sha1_hex(buf + strlen("object "), sha1))
+			die("Invalid SHA1 in tag: %s", command_buf.buf);
+		break;
+	case OBJ_COMMIT:
+		if (size < 40 + strlen("tree ") ||
+		    get_sha1_hex(buf + strlen("tree "), sha1))
+			die("Invalid SHA1 in commit: %s", command_buf.buf);
+	}
+
+	free(buf);
+	return find_object(sha1);
+}
+
+static struct object_entry *parse_treeish_dataref(const char **p)
+{
+	unsigned char sha1[20];
+	struct object_entry *e;
+
+	if (**p == ':') {	/* <mark> */
+		char *endptr;
+		e = find_mark(strtoumax(*p + 1, &endptr, 10));
+		if (endptr == *p + 1)
+			die("Invalid mark: %s", command_buf.buf);
+		if (!e)
+			die("Unknown mark: %s", command_buf.buf);
+		*p = endptr;
+		hashcpy(sha1, e->idx.sha1);
+	} else {	/* <sha1> */
+		if (get_sha1_hex(*p, sha1))
+			die("Invalid SHA1: %s", command_buf.buf);
+		e = find_object(sha1);
+		*p += 40;
+	}
+
+	while (!e || e->type != OBJ_TREE)
+		e = dereference(e, sha1);
+	return e;
+}
+
+static void print_ls(int mode, const unsigned char *sha1, const char *path)
+{
+	static struct strbuf line = STRBUF_INIT;
+
+	/* See show_tree(). */
+	const char *type =
+		S_ISGITLINK(mode) ? commit_type :
+		S_ISDIR(mode) ? tree_type :
+		blob_type;
+
+	/* mode SP type SP object_name TAB path LF */
+	strbuf_reset(&line);
+	strbuf_addf(&line, "%06o %s %s\t",
+			mode, type, sha1_to_hex(sha1));
+	quote_c_style(path, &line, NULL, 0);
+	strbuf_addch(&line, '\n');
+	cat_blob_write(line.buf, line.len);
+}
+
+static void parse_ls(struct branch *b)
+{
+	const char *p;
+	struct tree_entry *root = NULL;
+	struct tree_entry leaf = {0};
+
+	/* ls SP (<treeish> SP)? <path> */
+	p = command_buf.buf + strlen("ls ");
+	if (*p == '"') {
+		if (!b)
+			die("Not in a commit: %s", command_buf.buf);
+		root = &b->branch_tree;
+	} else {
+		struct object_entry *e = parse_treeish_dataref(&p);
+		root = new_tree_entry();
+		hashcpy(root->versions[1].sha1, e->idx.sha1);
+		load_tree(root);
+		if (*p++ != ' ')
+			die("Missing space after tree-ish: %s", command_buf.buf);
+	}
+	if (*p == '"') {
+		static struct strbuf uq = STRBUF_INIT;
+		const char *endp;
+		strbuf_reset(&uq);
+		if (unquote_c_style(&uq, p, &endp))
+			die("Invalid path: %s", command_buf.buf);
+		if (*endp)
+			die("Garbage after path in: %s", command_buf.buf);
+		p = uq.buf;
+	}
+	tree_content_get(root, p, &leaf);
+	if (!leaf.versions[1].mode) {
+		/*
+		 * Missing path?  Must be an empty subtree!
+		 *
+		 * When git learns to track empty directories, we can report
+		 * this by saying 'missing "path/to/directory"' instead.
+		 */
+		print_ls(S_IFDIR, (const unsigned char *) EMPTY_TREE_SHA1_BIN, p);
+	} else {
+		/*
+		 * A directory in preparation would have a sha1 of zero
+		 * until it is saved.  Save, for simplicity.
+		 */
+		if (S_ISDIR(leaf.versions[1].mode))
+			store_tree(&leaf);
+
+		print_ls(leaf.versions[1].mode, leaf.versions[1].sha1, p);
+	}
+	if (!b || root != &b->branch_tree)
+		release_tree_entry(root);
+}
+
 static void checkpoint(void)
 {
 	checkpoint_requested = 0;
@@ -3131,6 +3287,8 @@ int main(int argc, const char **argv)
 	while (read_next_command() != EOF) {
 		if (!strcmp("blob", command_buf.buf))
 			parse_new_blob();
+		else if (!prefixcmp(command_buf.buf, "ls "))
+			parse_ls(NULL);
 		else if (!prefixcmp(command_buf.buf, "commit "))
 			parse_new_commit();
 		else if (!prefixcmp(command_buf.buf, "tag "))
diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index b9aa3f0..6842b1f 100755
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@ -906,6 +906,97 @@ test_expect_success \
 	 git diff-tree -C --find-copies-harder -r N4^ N4 >actual &&
 	 compare_diff_raw expect actual'
 
+test_expect_success PIPE 'N: read and copy directory' '
+	cat >expect <<-\EOF
+	:100755 100755 f1fb5da718392694d0076d677d6d0e364c79b0bc f1fb5da718392694d0076d677d6d0e364c79b0bc C100	file2/newf	file3/newf
+	:100644 100644 7123f7f44e39be127c5eb701e5968176ee9d78b1 7123f7f44e39be127c5eb701e5968176ee9d78b1 C100	file2/oldf	file3/oldf
+	EOF
+	git update-ref -d refs/heads/N4 &&
+	rm -f backflow &&
+	mkfifo backflow &&
+	(
+		exec <backflow &&
+		cat <<-EOF &&
+		commit refs/heads/N4
+		committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+		data <<COMMIT
+		copy by tree hash, part 2
+		COMMIT
+
+		from refs/heads/branch^0
+		ls "file2"
+		EOF
+		read mode type tree filename &&
+		echo "M 040000 $tree file3"
+	) |
+	git fast-import --cat-blob-fd=3 3>backflow &&
+	git diff-tree -C --find-copies-harder -r N4^ N4 >actual &&
+	compare_diff_raw expect actual
+'
+
+test_expect_success PIPE 'N: read and copy "empty" directory' '
+	cat <<-\EOF >expect &&
+	OBJNAME
+	:000000 100644 OBJNAME OBJNAME A	greeting
+	OBJNAME
+	:100644 000000 OBJNAME OBJNAME D	unrelated
+	OBJNAME
+	:000000 100644 OBJNAME OBJNAME A	unrelated
+	EOF
+	git update-ref -d refs/heads/copy-empty &&
+	rm -f backflow &&
+	mkfifo backflow &&
+	(
+		exec <backflow &&
+		cat <<-EOF &&
+		commit refs/heads/copy-empty
+		committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+		data <<COMMIT
+		copy "empty" (missing) directory
+		COMMIT
+
+		M 100644 inline src/greeting
+		data <<BLOB
+		hello
+		BLOB
+		C src/greeting dst1/non-greeting
+		C src/greeting unrelated
+		# leave behind "empty" src directory
+		D src/greeting
+		ls "src"
+		EOF
+		read mode type tree filename &&
+		sed -e "s/X\$//" <<-EOF
+		M $mode $tree dst1
+		M $mode $tree dst2
+
+		commit refs/heads/copy-empty
+		committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+		data <<COMMIT
+		copy empty directory to root
+		COMMIT
+
+		M $mode $tree X
+
+		commit refs/heads/copy-empty
+		committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+		data <<COMMIT
+		add another file
+		COMMIT
+
+		M 100644 inline greeting
+		data <<BLOB
+		hello
+		BLOB
+		EOF
+	) |
+	git fast-import --cat-blob-fd=3 3>backflow &&
+	git rev-list copy-empty |
+	git diff-tree -r --root --stdin |
+	sed "s/$_x40/OBJNAME/g" >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success \
 	'N: delete directory by copying' \
 	'cat >expect <<-\EOF &&
-- 
1.7.4.rc0.580.g89dc.dirty


* [PATCH 2/3] fast-import: treat filemodify with empty tree as delete
From: Jonathan Nieder @ 2011-01-03  8:24 UTC
  To: David Barr
  Cc: Git Mailing List, Ramkumar Ramachandra, Sverre Rabbelier,
	Shawn O. Pearce
In-Reply-To: <20110103080130.GA8842@burratino>

Date: Sat, 11 Dec 2010 16:42:28 -0600

Traditionally, git trees do not contain entries for empty
subdirectories.  Generally speaking, subtrees are not created or
destroyed explicitly; instead, they automatically appear when needed
to hold regular files, symlinks, and submodules.

v1.7.3-rc0~75^2 (Teach fast-import to import subtrees named by tree
id, 2010-06-30) changed that, by allowing an empty subtree to be
included in a fast-import stream explicitly:

	M 040000 4b825dc642cb6eb9a060e54bf8d69288fbee4904 subdir

That was unintentional.  It is better, and more closely analogous to "git
read-tree --prefix", to treat such an input line as a request to delete
("to empty") subdir.
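
A small sketch of the intended behavior, assuming a fast-import that
includes this change: a stream creates dir/file and keep, then names the
empty tree for dir, and the resulting commit should contain only keep.
The branch name, paths and committer line are invented for the demo.

```shell
#!/bin/sh
# Sketch: a filemodify line naming the empty tree acts as a delete.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q repo
cd repo
empty_tree=$(git mktree </dev/null)

git fast-import --quiet <<EOF
commit refs/heads/demo
committer A U Thor <author@example.com> 1293870000 +0000
data <<MSG
create two paths, then empty one directory
MSG

M 100644 inline dir/file
data <<BLOB
hello
BLOB
M 100644 inline keep
data <<BLOB
hello
BLOB
M 040000 $empty_tree dir

EOF

remaining=$(git ls-tree -r --name-only demo)
echo "paths left in the commit: $remaining"
```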

Noticed-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
If this seems like a good idea it might be a candidate for v1.7.4.x.
Perhaps fsck.c should learn a "no empty trees" rule, too.

 fast-import.c          |   10 ++++++++
 t/t9300-fast-import.sh |   58 +++++++++++++++++++++++++++++++++++++++++------
 2 files changed, 60 insertions(+), 8 deletions(-)

diff --git a/fast-import.c b/fast-import.c
index a5cea45..385d12d 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -2231,6 +2231,16 @@ static void file_change_m(struct branch *b)
 		p = uq.buf;
 	}
 
+	/*
+	 * Git does not track empty, non-toplevel directories.
+	 */
+	if (S_ISDIR(mode) &&
+	    !memcmp(sha1, (const unsigned char *) EMPTY_TREE_SHA1_BIN, 20) &&
+	    *p) {
+		tree_content_remove(&b->branch_tree, p, NULL);
+		return;
+	}
+
 	if (S_ISGITLINK(mode)) {
 		if (inline_data)
 			die("Git links cannot be specified 'inline': %s",
diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index 53aad51..b9aa3f0 100755
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@ -42,6 +42,14 @@ echo "$@"'
 
 >empty
 
+test_expect_success 'setup: have pipes?' '
+	rm -f frob &&
+	if mkfifo frob
+	then
+		test_set_prereq PIPE
+	fi
+'
+
 ###
 ### series A
 ###
@@ -899,6 +907,48 @@ test_expect_success \
 	 compare_diff_raw expect actual'
 
 test_expect_success \
+	'N: delete directory by copying' \
+	'cat >expect <<-\EOF &&
+	OBJID
+	:100644 000000 OBJID OBJID D	foo/bar/qux
+	OBJID
+	:000000 100644 OBJID OBJID A	foo/bar/baz
+	:000000 100644 OBJID OBJID A	foo/bar/qux
+	EOF
+	 empty_tree=$(git mktree </dev/null) &&
+	 cat >input <<-INPUT_END &&
+	commit refs/heads/N-delete
+	committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+	data <<COMMIT
+	collect data to be deleted
+	COMMIT
+
+	deleteall
+	M 100644 inline foo/bar/baz
+	data <<DATA_END
+	hello
+	DATA_END
+	C "foo/bar/baz" "foo/bar/qux"
+	C "foo/bar/baz" "foo/bar/quux/1"
+	C "foo/bar/baz" "foo/bar/quuux"
+	M 040000 $empty_tree foo/bar/quux
+	M 040000 $empty_tree foo/bar/quuux
+
+	commit refs/heads/N-delete
+	committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+	data <<COMMIT
+	delete subdirectory
+	COMMIT
+
+	M 040000 $empty_tree foo/bar/qux
+	INPUT_END
+	 git fast-import <input &&
+	 git rev-list N-delete |
+		git diff-tree -r --stdin --root --always |
+		sed -e "s/$_x40/OBJID/g" >actual &&
+	 test_cmp expect actual'
+
+test_expect_success \
 	'N: copy root directory by tree hash' \
 	'cat >expect <<-\EOF &&
 	:100755 000000 f1fb5da718392694d0076d677d6d0e364c79b0bc 0000000000000000000000000000000000000000 D	file3/newf
@@ -1898,14 +1948,6 @@ test_expect_success 'R: print two blobs to stdout' '
 	test_cmp expect actual
 '
 
-test_expect_success 'setup: have pipes?' '
-	rm -f frob &&
-	if mkfifo frob
-	then
-		test_set_prereq PIPE
-	fi
-'
-
 test_expect_success PIPE 'R: copy using cat-file' '
 	expect_id=$(git hash-object big) &&
 	expect_len=$(wc -c <big) &&
-- 
1.7.4.rc0.580.g89dc.dirty

^ permalink raw reply related

* [PATCH 1/3] fast-import: clarify handling of cat-blob feature
From: Jonathan Nieder @ 2011-01-03  8:22 UTC (permalink / raw)
  To: David Barr
  Cc: Git Mailing List, Ramkumar Ramachandra, Sverre Rabbelier,
	Shawn O. Pearce
In-Reply-To: <20110103080130.GA8842@burratino>

Date: Thu Dec 9 14:45:21 2010 -0600

Remove the undocumented --cat-blob command line option.  It used to be
a no-op.

While at it, move parsing of --cat-blob-fd to parse_one_feature; this
makes the parse_argv loop a little easier to read and puts the code
implementing 'feature cat-blob' and --cat-blob-fd closer to each
other.
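
To make the intended split concrete, here is a simplified sketch of the two cat-blob branches, assuming only what the description above says: "cat-blob" is accepted from the stream, "cat-blob-fd=" from the command line. The enum and the function name classify_cat_blob are invented for illustration and are not part of fast-import.c.

```c
#include <string.h>

enum feature_result {
	FEATURE_UNKNOWN = 0,
	FEATURE_CAT_BLOB,
	FEATURE_CAT_BLOB_FD
};

/*
 * Hypothetical, simplified stand-in for the cat-blob branches of
 * parse_one_feature().  from_stream is nonzero for "feature ..."
 * lines in the stream and zero for command-line options (with the
 * leading "--" already removed).
 */
static enum feature_result classify_cat_blob(const char *feature,
					     int from_stream)
{
	static const char fd_prefix[] = "cat-blob-fd=";

	if (from_stream && !strcmp(feature, "cat-blob"))
		return FEATURE_CAT_BLOB;
	if (!from_stream && !strncmp(feature, fd_prefix, strlen(fd_prefix)))
		return FEATURE_CAT_BLOB_FD;
	return FEATURE_UNKNOWN;	/* the caller would die() here */
}
```

Anything that falls through to FEATURE_UNKNOWN corresponds to the "unknown option" or unsupported-feature error paths, which is what the new tests exercise.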

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
Forgot to mention: these are based against v1.7.4-rc0~24 (t9300: use
perl "head -c" clone in place of "dd bs=1 count=16000" kluge,
2010-12-13) but I wouldn't be surprised if they apply cleanly to other
commits, too. ;-)

 fast-import.c          |    9 +++------
 t/t9300-fast-import.sh |    9 +++++++++
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/fast-import.c b/fast-import.c
index 7857760..a5cea45 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -2977,8 +2977,10 @@ static int parse_one_feature(const char *feature, int from_stream)
 		option_import_marks(feature + 13, from_stream);
 	} else if (!prefixcmp(feature, "export-marks=")) {
 		option_export_marks(feature + 13);
-	} else if (!strcmp(feature, "cat-blob")) {
+	} else if (from_stream && !strcmp(feature, "cat-blob")) {
 		; /* Don't die - this feature is supported */
+	} else if (!from_stream && !prefixcmp(feature, "cat-blob-fd=")) {
+		option_cat_blob_fd(feature + strlen("cat-blob-fd="));
 	} else if (!prefixcmp(feature, "relative-marks")) {
 		relative_marks_paths = 1;
 	} else if (!prefixcmp(feature, "no-relative-marks")) {
@@ -3073,11 +3075,6 @@ static void parse_argv(void)
 		if (parse_one_feature(a + 2, 0))
 			continue;
 
-		if (!prefixcmp(a + 2, "cat-blob-fd=")) {
-			option_cat_blob_fd(a + 2 + strlen("cat-blob-fd="));
-			continue;
-		}
-
 		die("unknown option %s", a);
 	}
 	if (i != global_argc)
diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index 222d105..53aad51 100755
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@ -1769,10 +1769,19 @@ test_expect_success 'R: feature cat-blob supported' '
 	git fast-import
 '
 
+test_expect_success 'R: no command line option for cat-blob feature' '
+	test_must_fail git fast-import --cat-blob <empty
+'
+
 test_expect_success 'R: cat-blob-fd must be a nonnegative integer' '
 	test_must_fail git fast-import --cat-blob-fd=-1 </dev/null
 '
 
+test_expect_success 'R: cat-blob-fd cannot be specified in stream' '
+	echo "feature cat-blob-fd=1" |
+	test_must_fail git fast-import
+'
+
 test_expect_success 'R: print old blob' '
 	blob=$(echo "yes it can" | git hash-object -w --stdin) &&
 	cat >expect <<-EOF &&
-- 
1.7.4.rc0.580.g89dc.dirty

^ permalink raw reply related

* [PATCH/RFC v2 0/3] fast-import: add 'ls' command
From: Jonathan Nieder @ 2011-01-03  8:01 UTC (permalink / raw)
  To: David Barr
  Cc: Git Mailing List, Ramkumar Ramachandra, Sverre Rabbelier,
	Shawn O. Pearce
In-Reply-To: <1291286420-13591-1-git-send-email-david.barr@cordelta.com>

David Barr wrote:

> This patch is by no means complete - I still need to consider the edge cases.
> It does achieve the basic requirements for simplifying svn-fe.

It really does do that.  About time for a reroll.

Patches 1 and 2 are nearby fixes noticed while hacking at this.
Changes in patch 3 from v1 will be mentioned in the same message as
the patch.

Thoughts, improvements, especially tests welcome.  Let's get this
feature ready for wide use.

David Barr (1):
  fast-import: add 'ls' command

Jonathan Nieder (2):
  fast-import: clarify handling of cat-blob feature
  fast-import: treat filemodify with empty tree as delete

 Documentation/git-fast-import.txt |   49 ++++++++++-
 fast-import.c                     |  181 +++++++++++++++++++++++++++++++++++--
 t/t9300-fast-import.sh            |  158 ++++++++++++++++++++++++++++++--
 3 files changed, 371 insertions(+), 17 deletions(-)

-- 
1.7.4.rc0

^ permalink raw reply

* Re: [RFC/PATCH] Documentation/technical: document quoting API
From: Nguyen Thai Ngoc Duy @ 2011-01-03  7:39 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, Christian Couder, Jeff King, Dmitry Potapov
In-Reply-To: <20110103063534.GA3661@burratino>

On Mon, Jan 3, 2011 at 1:35 PM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> Briefly explain the zoo of quoting functions.

I rarely read these documents. One reason is that sometimes I don't
know such a document exists. Can we put a pointer to the document in
quote.h? I assume that anyone who wants to use these will at least
look at quote.h first.
-- 
Duy

^ permalink raw reply

* [RFC/PATCH] Documentation/technical: document quoting API
From: Jonathan Nieder @ 2011-01-03  6:35 UTC (permalink / raw)
  To: git; +Cc: Christian Couder, Jeff King, Dmitry Potapov

Briefly explain the zoo of quoting functions.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
This is just a starting point, I fear.  Not even compile-tested.
Problems:

 - not very brief
 - not a great overview

But I was happy to have the chance to read through the available
functions.

Suggestions and other improvements welcome.

 Documentation/technical/api-quote.txt |  124 ++++++++++++++++++++++++++++++---
 quote.h                               |   14 ++--
 2 files changed, 121 insertions(+), 17 deletions(-)
 rewrite Documentation/technical/api-quote.txt (84%)

diff --git a/Documentation/technical/api-quote.txt b/Documentation/technical/api-quote.txt
dissimilarity index 84%
index e8a1bce..18da370 100644
--- a/Documentation/technical/api-quote.txt
+++ b/Documentation/technical/api-quote.txt
@@ -1,10 +1,114 @@
-quote API
-=========
-
-Talk about <quote.h>, things like
-
-* sq_quote and unquote
-* c_style quote and unquote
-* quoting for foreign languages
-
-(JC)
+quote API
+=========
+
+The quoting API can be used to replace unusual characters for
+shell safety or for output readability and parseability.
+It also can be used to perform the inverse operation and recover
+the unusual characters again.
+
+C-style quoting
+---------------
+
+`quote_c_style` quotes a string in a manner that might be familiar
+to C programmers.  This can be used to quote newlines and tabs in
+filenames for patches, for example.
+
+. if sb and fp are both NULL, it returns the number of bytes needed
+  to hold the quoted version of "name", counting the double quotes
+  around it but not terminating NUL.  If "name" does not need quoting,
+  it returns 0.
+
+. otherwise, it emits the quoted version of "name" to a stream,
+  strbuf, or both.  Output will have enclosing double quotes
+  suppressed if requested with the "no_dq" parameter.
+
+`quote_two_c_style`::
+	Quote two paths (prefix + path) in C-style and concatenate them.
+	One should use this instead of calling `quote_c_style` twice
+	to avoid unsightly quotation marks in the middle.
+
+`unquote_c_style`::
+	This unwraps what quote_c_style() produces in place,
+	but returns -1 and doesn't touch if the input does not start with
+	a double-quote or otherwise differs from what quote_c_style
+	would have produced.  Though note that this function will
+	allocate memory in the strbuf, so calling `strbuf_release`
+	is mandatory regardless of the result `unquote_c_style` returns.
++
+Updates the endp pointer to point at one past the ending double quote
+if given.
+
+`write_name_quoted`::
+	`write_name_quoted` is like `quote_c_style` but takes a
+	different set of arguments.
+	Instead of asking for quotes or not, you pass a "terminator".
+	If it's \0 then we assume you don't want to escape, else C
+	escaping is performed. In any case, the terminator is also
+	appended to the stream.
+
+`write_name_quotedpfx`::
+	`write_name_quotedpfx` works like `write_name_quoted` but takes
+	prefix/prefix_len arguments.  The first "prefix_len" characters
+	of "prefix" will be prepended when emitting "name".
+
+`write_name_quoted_relative`::
+	This is a sort of converse to `write_name_quotedpfx`.
+	The path "name" is made relative to the directory described by
+	prefix and prefix_len by stripping away path components and
+	prepending `../` when necessary before quoting.
+
+Quoting for the shell
+---------------------
+
+`sq_quote` copies its argument quoted for shell safety.
+Any single quote is replaced with '\'', any exclamation point
+is replaced with '\!', and the whole thing is enclosed in a
+single-quote pair.
+
+For example, if you are passing the result to `system()` as an
+argument:
+--------------
+sprintf(cmd, "foobar %s %s", sq_quote(arg0), sq_quote(arg1));
+--------------
+would be appropriate.  If the `system()` is going to call 'ssh' to
+run the command on the other side:
+--------------
+sprintf(cmd, "git-diff-tree %s %s", sq_quote(arg0), sq_quote(arg1));
+sprintf(rcmd, "ssh %s %s", sq_quote(host), sq_quote(cmd));
+--------------
+Note that the above examples leak memory!  Remember to free result from
+`sq_quote()` in a real application.
+
+`sq_quote_print`::
+	Writes to a stream instead of a new buffer.
+
+`sq_quote_buf`::
+	Appends to a strbuf instead of allocating a new buffer.
+
+`sq_quote_argv`::
+	Appends a list of command-line-ready arguments to "dst",
+	each preceded by a space character.  This function is
+	available for scripted use as 'git rev-parse --sq-quote'.
+
+`sq_dequote`::
+	This unwraps what sq_quote() produces in place, but returns
+	NULL if the input does not look like what sq_quote would have
+	produced.
+
+`sq_dequote_to_argv`::
+	Like the above, but unwrap many arguments in the same string
+	separated by space. "*argv", "*nr", and "*alloc" should be a
+	pointer to a malloc-ed array (or NULL), its current number of
+	valid elements, and the number of allocated elements, for
+	example as managed with ALLOC_GROW.  The result is appended
+	after the valid part of *argv.
+
+Quoting for other languages
+---------------------------
+
+`perl_quote_print`::
+`python_quote_print`::
+`tcl_quote_print`::
+	Quote as a string literal for evaluation in the specified
+	language.  This is used by 'git for-each-ref' to output
+	various aspects of objects for use by language bindings.
diff --git a/quote.h b/quote.h
index 38003bf..cec40b0 100644
--- a/quote.h
+++ b/quote.h
@@ -23,9 +23,8 @@
  * Note that the above examples leak memory!  Remember to free result from
  * sq_quote() in a real application.
  *
- * sq_quote_buf() writes to an existing buffer of specified size; it
- * will return the number of characters that would have been written
- * excluding the final null regardless of the buffer size.
+ * sq_quote_buf() writes to the end of a strbuf instead of
+ * allocating a new buffer.
  */
 
 extern void sq_quote_print(FILE *stream, const char *src);
@@ -40,10 +39,11 @@ extern void sq_quote_argv(struct strbuf *, const char **argv, size_t maxlen);
 extern char *sq_dequote(char *);
 
 /*
- * Same as the above, but can be used to unwrap many arguments in the
- * same string separated by space. "next" is changed to point to the
- * next argument that should be passed as first parameter. When there
- * is no more argument to be dequoted, "next" is updated to point to NULL.
+ * Similar to the above, but unwraps many arguments in the
+ * same string separated by space. "*argv" is expanded to hold
+ * the dequoted arguments in positions (*argv)[*nr], (*argv)[*nr+1],
+ * etc., and *nr and *alloc are updated to hold the new number of
+ * entries and the allocated size of the array.
  */
 extern int sq_dequote_to_argv(char *arg, const char ***argv, int *nr, int *alloc);
 
-- 
1.7.4.rc0.580.g89dc.dirty

^ permalink raw reply related

* Re: gitattributes don't work
From: Nguyen Thai Ngoc Duy @ 2011-01-03  4:34 UTC (permalink / raw)
  To: Marcin Wiśnicki; +Cc: git, Joshua Jensen
In-Reply-To: <ifr610$3kl$1@dough.gmane.org>

2011/1/3 Marcin Wiśnicki <mwisnicki@gmail.com>:
> I'm trying to exclude certain paths (those that contain "xmac/gen/") from
> diff output using .git/info/attributes (not .gitattributes).
>
> According to gitattributes(5) it supports patterns from gitignore(5).
>
> Example path that must be excluded:
> src/byucc/jhdl/CSRC/xmac/gen/and2_dp_g.xmac
>
> What I've tried but didn't work:
> xmac/gen/ -diff
>
> Following works but is not what I want:
> *.xmac -diff
>
> It seems I can only get it to work for file names but not for whole paths.
> What am I doing wrong or is this a bug ?

While gitattributes(5) says that, gitattributes and gitignore actually
use different matching implementations. The gitattributes one seems
unchanged since its introduction in d0bfd02 (Add basic infrastructure
to assign attributes to paths - 2007-04-12). gitignore, on the other
hand, learned the foo/ pattern later in d6b8fc3 (gitignore(5): Allow "foo/"
in ignore list to match directory "foo" - 2008-01-31).

Yeah, it looks like a bug to me. A better way to solve this once and
for all, is to unify the two implementations (which is good for
gitattr because there have been optimizations added to gitignore). I
tried long ago and gave up. Something to do with the order of matching
(gitignore tries inner directories first, while gitattr starts from
outer ones).

For the time being, anyone who changes gitignore should be reminded to
consider whether it's applicable to gitattributes and vice versa.

Which reminds me, Joshua, maybe you should add case-insensitive
support to gitattributes too ;-)
-- 
Duy

^ permalink raw reply

* [PATCH 12/12] vcs-svn: teach line_buffer about temporary files
From: Jonathan Nieder @ 2011-01-03  3:10 UTC (permalink / raw)
  To: git; +Cc: David Barr, Thomas Rast, Ramkumar Ramachandra
In-Reply-To: <20110103030328.GA10143@burratino>

It can sometimes be useful to write information temporarily to a file,
to read back later.  These functions allow a program to use the
line_buffer facilities when doing so.

It works like this:

 1. find a unique filename with buffer_tmpfile_init.
 2. rewind with buffer_tmpfile_rewind.  This returns a stdio
    handle for writing.
 3. when finished writing, declare so with
    buffer_tmpfile_prepare_to_read.  The return value indicates
    how many bytes were written.
 4. read whatever portion of the file is needed.
 5. if finished, remove the temporary file with buffer_deinit.
    otherwise, go back to step 2.

The svn support would use this to buffer the postimage from delta
application until the length is known and fast-import can receive
the resulting blob.
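
The steps above can be sketched with plain stdio, without the line_buffer wrappers. This is only an illustration of the pattern; the helper name emit_with_length is hypothetical, and the "data <length>" framing is the one svn-fe is expected to send to fast-import.

```c
#include <stdio.h>

/*
 * Hypothetical helper: buffer "payload" in a scratch file, measure
 * it, then emit "data <length>\n" followed by the payload to "out".
 * Returns the payload length, or -1 on error.
 */
static long emit_with_length(FILE *out, const char *payload)
{
	char chunk[4096];
	size_t n;
	long len;
	FILE *tmp = tmpfile();		/* step 1: get a scratch file */

	if (!tmp)
		return -1;
	rewind(tmp);			/* step 2: position for writing */
	if (fputs(payload, tmp) == EOF)
		goto fail;
	len = ftell(tmp);		/* step 3: how much was written? */
	if (len < 0 || fseek(tmp, 0, SEEK_SET))
		goto fail;
	fprintf(out, "data %ld\n", len);
	/* step 4: read back what was written */
	while ((n = fread(chunk, 1, sizeof(chunk), tmp)) > 0)
		fwrite(chunk, 1, n, out);
	fclose(tmp);			/* step 5: tmpfile() files vanish on close */
	return len;
fail:
	fclose(tmp);
	return -1;
}
```

Calling emit_with_length(stdout, "hello\n") would write a "data 6" header and then the six payload bytes, which is the shape of output the length bookkeeping exists for.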

Based-on-patch-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
David, this is another piece of infrastructure from early svn-fe3
history.  I've cleaned up the API a little bit but the idea is the
same.  Thank you.

 vcs-svn/line_buffer.c   |   24 ++++++++++++++++++++++++
 vcs-svn/line_buffer.h   |    7 ++++++-
 vcs-svn/line_buffer.txt |   22 ++++++++++++++++++++++
 3 files changed, 52 insertions(+), 1 deletions(-)

diff --git a/vcs-svn/line_buffer.c b/vcs-svn/line_buffer.c
index e29a81a..aedf105 100644
--- a/vcs-svn/line_buffer.c
+++ b/vcs-svn/line_buffer.c
@@ -25,6 +25,14 @@ int buffer_fdinit(struct line_buffer *buf, int fd)
 	return 0;
 }
 
+int buffer_tmpfile_init(struct line_buffer *buf)
+{
+	buf->infile = tmpfile();
+	if (!buf->infile)
+		return -1;
+	return 0;
+}
+
 int buffer_deinit(struct line_buffer *buf)
 {
 	int err;
@@ -35,6 +43,22 @@ int buffer_deinit(struct line_buffer *buf)
 	return err;
 }
 
+FILE *buffer_tmpfile_rewind(struct line_buffer *buf)
+{
+	rewind(buf->infile);
+	return buf->infile;
+}
+
+long buffer_tmpfile_prepare_to_read(struct line_buffer *buf)
+{
+	long pos = ftell(buf->infile);
+	if (pos < 0)
+		return error("ftell error: %s", strerror(errno));
+	if (fseek(buf->infile, 0, SEEK_SET))
+		return error("seek error: %s", strerror(errno));
+	return pos;
+}
+
 int buffer_read_char(struct line_buffer *buf)
 {
 	return fgetc(buf->infile);
diff --git a/vcs-svn/line_buffer.h b/vcs-svn/line_buffer.h
index 630d83c..96ce966 100644
--- a/vcs-svn/line_buffer.h
+++ b/vcs-svn/line_buffer.h
@@ -15,12 +15,17 @@ struct line_buffer {
 int buffer_init(struct line_buffer *buf, const char *filename);
 int buffer_fdinit(struct line_buffer *buf, int fd);
 int buffer_deinit(struct line_buffer *buf);
+void buffer_reset(struct line_buffer *buf);
+
+int buffer_tmpfile_init(struct line_buffer *buf);
+FILE *buffer_tmpfile_rewind(struct line_buffer *buf);	/* prepare to write. */
+long buffer_tmpfile_prepare_to_read(struct line_buffer *buf);
+
 char *buffer_read_line(struct line_buffer *buf);
 char *buffer_read_string(struct line_buffer *buf, uint32_t len);
 int buffer_read_char(struct line_buffer *buf);
 void buffer_read_binary(struct line_buffer *buf, struct strbuf *sb, uint32_t len);
 void buffer_copy_bytes(struct line_buffer *buf, uint32_t len);
 void buffer_skip_bytes(struct line_buffer *buf, uint32_t len);
-void buffer_reset(struct line_buffer *buf);
 
 #endif
diff --git a/vcs-svn/line_buffer.txt b/vcs-svn/line_buffer.txt
index 4e8fb71..e89cc41 100644
--- a/vcs-svn/line_buffer.txt
+++ b/vcs-svn/line_buffer.txt
@@ -24,6 +24,28 @@ The calling program:
 When finished, the caller can use `buffer_reset` to deallocate
 resources.
 
+Using temporary files
+---------------------
+
+Temporary files provide a place to store data that should not outlive
+the calling program.  A program
+
+ - initializes a `struct line_buffer` to LINE_BUFFER_INIT
+ - requests a temporary file with `buffer_tmpfile_init`
+ - acquires an output handle by calling `buffer_tmpfile_rewind`
+ - uses standard I/O functions like `fprintf` and `fwrite` to fill
+   the temporary file
+ - declares writing is over with `buffer_tmpfile_prepare_to_read`
+ - can re-read what was written with `buffer_read_line`,
+   `buffer_read_string`, and so on
+ - can reuse the temporary file by calling `buffer_tmpfile_rewind`
+   again
+ - removes the temporary file with `buffer_deinit`, perhaps to
+   reuse the line_buffer for some other file.
+
+When finished, the calling program can use `buffer_reset` to deallocate
+resources.
+
 Functions
 ---------
 
-- 
1.7.4.rc0.580.g89dc.dirty

^ permalink raw reply related

* [PATCH 11/12] vcs-svn: allow input from file descriptor
From: Jonathan Nieder @ 2011-01-03  3:09 UTC (permalink / raw)
  To: git; +Cc: David Barr, Thomas Rast, Ramkumar Ramachandra
In-Reply-To: <20110103030328.GA10143@burratino>

Based-on-patch-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
No fd_buffer.  This is essentially the way David did it in the first
place; sorry to have doubted.

 t/t0081-line-buffer.sh  |    9 +++++++++
 test-line-buffer.c      |   11 ++++++++---
 vcs-svn/line_buffer.c   |    8 ++++++++
 vcs-svn/line_buffer.h   |    1 +
 vcs-svn/line_buffer.txt |    9 +++++----
 5 files changed, 31 insertions(+), 7 deletions(-)

diff --git a/t/t0081-line-buffer.sh b/t/t0081-line-buffer.sh
index a8eeb20..550fad0 100755
--- a/t/t0081-line-buffer.sh
+++ b/t/t0081-line-buffer.sh
@@ -131,6 +131,15 @@ test_expect_success PIPE,EXPENSIVE 'longer read (around 65536 bytes)' '
 	long_read_test 65536
 '
 
+test_expect_success 'read from file descriptor' '
+	rm -f input &&
+	echo hello >expect &&
+	echo hello >input &&
+	echo copy 6 |
+	test-line-buffer "&4" 4<input >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'buffer_read_string copes with null byte' '
 	>expect &&
 	q_to_nul <<-\EOF | test-line-buffer >actual &&
diff --git a/test-line-buffer.c b/test-line-buffer.c
index 19bf2d4..25b20b9 100644
--- a/test-line-buffer.c
+++ b/test-line-buffer.c
@@ -69,13 +69,18 @@ int main(int argc, char *argv[])
 	else if (argc == 2)
 		filename = argv[1];
 	else
-		usage("test-line-buffer [file] < script");
+		usage("test-line-buffer [file | &fd] < script");
 
 	if (buffer_init(&stdin_buf, NULL))
 		die_errno("open error");
 	if (filename) {
-		if (buffer_init(&file_buf, filename))
-			die_errno("error opening %s", filename);
+		if (*filename == '&') {
+			if (buffer_fdinit(&file_buf, strtouint32(filename + 1)))
+				die_errno("error opening fd %s", filename + 1);
+		} else {
+			if (buffer_init(&file_buf, filename))
+				die_errno("error opening %s", filename);
+		}
 		input = &file_buf;
 	}
 
diff --git a/vcs-svn/line_buffer.c b/vcs-svn/line_buffer.c
index 37ec56e..e29a81a 100644
--- a/vcs-svn/line_buffer.c
+++ b/vcs-svn/line_buffer.c
@@ -17,6 +17,14 @@ int buffer_init(struct line_buffer *buf, const char *filename)
 	return 0;
 }
 
+int buffer_fdinit(struct line_buffer *buf, int fd)
+{
+	buf->infile = fdopen(fd, "r");
+	if (!buf->infile)
+		return -1;
+	return 0;
+}
+
 int buffer_deinit(struct line_buffer *buf)
 {
 	int err;
diff --git a/vcs-svn/line_buffer.h b/vcs-svn/line_buffer.h
index 0a59c73..630d83c 100644
--- a/vcs-svn/line_buffer.h
+++ b/vcs-svn/line_buffer.h
@@ -13,6 +13,7 @@ struct line_buffer {
 #define LINE_BUFFER_INIT {"", STRBUF_INIT, NULL}
 
 int buffer_init(struct line_buffer *buf, const char *filename);
+int buffer_fdinit(struct line_buffer *buf, int fd);
 int buffer_deinit(struct line_buffer *buf);
 char *buffer_read_line(struct line_buffer *buf);
 char *buffer_read_string(struct line_buffer *buf, uint32_t len);
diff --git a/vcs-svn/line_buffer.txt b/vcs-svn/line_buffer.txt
index f8eaa4d..4e8fb71 100644
--- a/vcs-svn/line_buffer.txt
+++ b/vcs-svn/line_buffer.txt
@@ -27,10 +27,11 @@ resources.
 Functions
 ---------
 
-`buffer_init`::
-	Open the named file for input.  If filename is NULL,
-	start reading from stdin.  On failure, returns -1 (with
-	errno indicating the nature of the failure).
+`buffer_init`, `buffer_fdinit`::
+	Open the named file or file descriptor for input.
+	buffer_init(buf, NULL) prepares to read from stdin.
+	On failure, returns -1 (with errno indicating the nature
+	of the failure).
 
 `buffer_deinit`::
 	Stop reading from the current file (closing it unless
-- 
1.7.4.rc0.580.g89dc.dirty

^ permalink raw reply related

* [PATCH 10/12] vcs-svn: allow character-oriented input
From: Jonathan Nieder @ 2011-01-03  3:06 UTC (permalink / raw)
  To: git; +Cc: David Barr, Thomas Rast, Ramkumar Ramachandra
In-Reply-To: <20110103030328.GA10143@burratino>

buffer_read_char can be used in place of buffer_read_string(1) to
avoid consuming valuable static buffer space.  The delta applier will
use this to read variable-length integers one byte at a time.

Underneath, it is fgetc, wrapped so the line_buffer library can
maintain its role as gatekeeper of input.

Later it might be worth checking if fgetc_unlocked is faster ---
most line_buffer functions are not thread-safe anyway.
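
For a sense of what the delta applier would do with a character-at-a-time reader, here is a sketch of a variable-length integer parser built directly on fgetc. It assumes the svndiff convention of 7 data bits per byte, most significant group first, with the high bit set on every byte except the last. read_varint and bytes_as_stream are names invented for this example, and overflow of very long encodings is not handled.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: read one variable-length integer from "in".
 * Returns the value, or -1 if EOF or an error cuts the number short.
 */
static int64_t read_varint(FILE *in)
{
	uint64_t value = 0;
	int ch;

	while ((ch = fgetc(in)) != EOF) {
		value = (value << 7) | (uint64_t)(ch & 0x7f);
		if (!(ch & 0x80))	/* high bit clear: last byte */
			return (int64_t)value;
	}
	return -1;
}

/* Convenience for trying it out: a scratch stream holding "bytes". */
static FILE *bytes_as_stream(const void *bytes, size_t len)
{
	FILE *f = tmpfile();

	if (!f)
		return NULL;
	if (fwrite(bytes, 1, len, f) != len || fseek(f, 0, SEEK_SET)) {
		fclose(f);
		return NULL;
	}
	return f;
}
```

Under that encoding the byte pair 0x81 0x00 decodes to 128, and a lone 0x81 is reported as truncated.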

Helped-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 vcs-svn/line_buffer.c |    5 +++++
 vcs-svn/line_buffer.h |    1 +
 2 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/vcs-svn/line_buffer.c b/vcs-svn/line_buffer.c
index 661b007..37ec56e 100644
--- a/vcs-svn/line_buffer.c
+++ b/vcs-svn/line_buffer.c
@@ -27,6 +27,11 @@ int buffer_deinit(struct line_buffer *buf)
 	return err;
 }
 
+int buffer_read_char(struct line_buffer *buf)
+{
+	return fgetc(buf->infile);
+}
+
 /* Read a line without trailing newline. */
 char *buffer_read_line(struct line_buffer *buf)
 {
diff --git a/vcs-svn/line_buffer.h b/vcs-svn/line_buffer.h
index 0c2d3d9..0a59c73 100644
--- a/vcs-svn/line_buffer.h
+++ b/vcs-svn/line_buffer.h
@@ -16,6 +16,7 @@ int buffer_init(struct line_buffer *buf, const char *filename);
 int buffer_deinit(struct line_buffer *buf);
 char *buffer_read_line(struct line_buffer *buf);
 char *buffer_read_string(struct line_buffer *buf, uint32_t len);
+int buffer_read_char(struct line_buffer *buf);
 void buffer_read_binary(struct line_buffer *buf, struct strbuf *sb, uint32_t len);
 void buffer_copy_bytes(struct line_buffer *buf, uint32_t len);
 void buffer_skip_bytes(struct line_buffer *buf, uint32_t len);
-- 
1.7.4.rc0.580.g89dc.dirty

^ permalink raw reply related

* [PATCH 09/12] vcs-svn: add binary-safe read function
From: Jonathan Nieder @ 2011-01-03  3:05 UTC (permalink / raw)
  To: git; +Cc: David Barr, Thomas Rast, Ramkumar Ramachandra
In-Reply-To: <20110103030328.GA10143@burratino>

buffer_read_string works well for non line-oriented input except for
one problem: it does not tell the caller how many bytes were actually
written.  This means that unless one is very careful about checking
for errors (and eof) the calling program cannot tell the difference
between the string "foo" followed by an early end of file and the
string "foo\0bar\0baz".

So introduce a variant that reports the length, too, a thinner wrapper
around strbuf_fread.  Its result is written to a strbuf so the caller
does not need to keep track of the number of bytes read.
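
The ambiguity is easy to reproduce with stdio alone. In the sketch below (the helper name bytes_read_back is made up), both inputs look like "foo" to strlen(), and only the count returned by fread tells them apart. That count is what a length-reporting read makes available to the caller.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: store "len" bytes in a scratch file, ask for
 * "want" bytes back into "dst" and return how many actually arrived.
 * "dst" must have room for "want" bytes.
 */
static size_t bytes_read_back(const char *data, size_t len,
			      size_t want, char *dst)
{
	size_t got = 0;
	FILE *f = tmpfile();

	if (!f)
		return 0;
	if (fwrite(data, 1, len, f) == len && !fseek(f, 0, SEEK_SET))
		got = fread(dst, 1, want, f);	/* short count on early EOF */
	fclose(f);
	return got;
}
```

Asking for 11 bytes yields 3 for the truncated input "foo" and 11 for "foo\0bar\0baz", even though the destination buffer reads as the C string "foo" in both cases.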

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 t/t0081-line-buffer.sh |   18 ++++++++++++++++++
 test-line-buffer.c     |   10 ++++++++++
 vcs-svn/line_buffer.c  |    6 ++++++
 vcs-svn/line_buffer.h  |    1 +
 4 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/t/t0081-line-buffer.sh b/t/t0081-line-buffer.sh
index 33a728e..a8eeb20 100755
--- a/t/t0081-line-buffer.sh
+++ b/t/t0081-line-buffer.sh
@@ -151,6 +151,15 @@ test_expect_success 'skip, copy null byte' '
 	test_cmp expect actual
 '
 
+test_expect_success 'read null byte' '
+	echo ">QhelloQ" | q_to_nul >expect &&
+	q_to_nul <<-\EOF | test-line-buffer >actual &&
+	binary 8
+	QhelloQ
+	EOF
+	test_cmp expect actual
+'
+
 test_expect_success 'long reads are truncated' '
 	echo foo >expect &&
 	test-line-buffer <<-\EOF >actual &&
@@ -171,4 +180,13 @@ test_expect_success 'long copies are truncated' '
 	test_cmp expect actual
 '
 
+test_expect_success 'long binary reads are truncated' '
+	echo ">foo" >expect &&
+	test-line-buffer <<-\EOF >actual &&
+	binary 5
+	foo
+	EOF
+	test_cmp expect actual
+'
+
 test_done
diff --git a/test-line-buffer.c b/test-line-buffer.c
index ec19b13..19bf2d4 100644
--- a/test-line-buffer.c
+++ b/test-line-buffer.c
@@ -3,6 +3,7 @@
  */
 
 #include "git-compat-util.h"
+#include "strbuf.h"
 #include "vcs-svn/line_buffer.h"
 
 static uint32_t strtouint32(const char *s)
@@ -17,6 +18,15 @@ static uint32_t strtouint32(const char *s)
 static void handle_command(const char *command, const char *arg, struct line_buffer *buf)
 {
 	switch (*command) {
+	case 'b':
+		if (!prefixcmp(command, "binary ")) {
+			struct strbuf sb = STRBUF_INIT;
+			strbuf_addch(&sb, '>');
+			buffer_read_binary(buf, &sb, strtouint32(arg));
+			fwrite(sb.buf, 1, sb.len, stdout);
+			strbuf_release(&sb);
+			return;
+		}
 	case 'c':
 		if (!prefixcmp(command, "copy ")) {
 			buffer_copy_bytes(buf, strtouint32(arg));
diff --git a/vcs-svn/line_buffer.c b/vcs-svn/line_buffer.c
index 806932b..661b007 100644
--- a/vcs-svn/line_buffer.c
+++ b/vcs-svn/line_buffer.c
@@ -56,6 +56,12 @@ char *buffer_read_string(struct line_buffer *buf, uint32_t len)
 	return ferror(buf->infile) ? NULL : buf->blob_buffer.buf;
 }
 
+void buffer_read_binary(struct line_buffer *buf,
+				struct strbuf *sb, uint32_t size)
+{
+	strbuf_fread(sb, size, buf->infile);
+}
+
 void buffer_copy_bytes(struct line_buffer *buf, uint32_t len)
 {
 	char byte_buffer[COPY_BUFFER_LEN];
diff --git a/vcs-svn/line_buffer.h b/vcs-svn/line_buffer.h
index fb37390..0c2d3d9 100644
--- a/vcs-svn/line_buffer.h
+++ b/vcs-svn/line_buffer.h
@@ -16,6 +16,7 @@ int buffer_init(struct line_buffer *buf, const char *filename);
 int buffer_deinit(struct line_buffer *buf);
 char *buffer_read_line(struct line_buffer *buf);
 char *buffer_read_string(struct line_buffer *buf, uint32_t len);
+void buffer_read_binary(struct line_buffer *buf, struct strbuf *sb, uint32_t len);
 void buffer_copy_bytes(struct line_buffer *buf, uint32_t len);
 void buffer_skip_bytes(struct line_buffer *buf, uint32_t len);
 void buffer_reset(struct line_buffer *buf);
-- 
1.7.4.rc0.580.g89dc.dirty

^ permalink raw reply related

* [PATCHES 9-12/12] line_buffer: more wrappers around stdio functions
From: Jonathan Nieder @ 2011-01-03  3:03 UTC (permalink / raw)
  To: git; +Cc: David Barr, Thomas Rast, Ramkumar Ramachandra
In-Reply-To: <20110103004900.GA30506@burratino>

This might be the last batch of line_buffer patches for the moment[1].
It introduces wrappers around strbuf_fread, fgetc, fdopen, and tmpfile
so the line_buffer lib can do what those functions do.

The motivation in the background is delta application.  We need:

 - binary-safe input, since svn (like most version control systems)
   gets used to store binary files from time to time.
 - character-oriented input (fgetc) as a basic convenience, needed in
   particular for reading variable-length integers in svndiff blocks.
 - input from file descriptors, to read information requested from
   fast-import (in particular, delta preimages).
 - temporary files, to store the result of delta application and
   retrieve its length so svn-fe can write

	data <length>
	... delta application result ...

   to fast-import.

The ideas behind the third and fourth patches (patches 11 and 12) are
from David Barr's earlier work in the same direction.  Patches are
based against

  [PATCH 8/8] t0081 (line-buffer): add buffering tests

so we can reuse some of the testing infrastructure.  They are numbered
accordingly for easy application.

Each patch introduces new API.  I would be happy if you can find an
infelicity or two so we can fix the functions now before people get
used to them.

Jonathan Nieder (4):
  vcs-svn: add binary-safe read function
  vcs-svn: allow character-oriented input
  vcs-svn: allow input from file descriptor
  vcs-svn: teach line_buffer about temporary files

 t/t0081-line-buffer.sh  |   27 +++++++++++++++++++++++++++
 test-line-buffer.c      |   21 ++++++++++++++++++---
 vcs-svn/line_buffer.c   |   43 +++++++++++++++++++++++++++++++++++++++++++
 vcs-svn/line_buffer.h   |   10 +++++++++-
 vcs-svn/line_buffer.txt |   31 +++++++++++++++++++++++++++----
 5 files changed, 124 insertions(+), 8 deletions(-)

[1] There's another patch to use 64-bit offsets but the API changes
are more obvious.

^ permalink raw reply

* Re: [PATCH 8/8] t0081 (line-buffer): add buffering tests
From: Jonathan Nieder @ 2011-01-03  1:34 UTC (permalink / raw)
  To: git; +Cc: David Barr, Thomas Rast, Ramkumar Ramachandra
In-Reply-To: <20110103010716.GE30506@burratino>

Jonathan Nieder wrote:

> First check that fread can handle a 0-length read from an empty fifo.
> The writing end of the fifo is opened in advance in a subshell since
> even 0-length reads are allowed to block when the writing end of a
> pipe is not open.

Erm, what I mean here is that open(O_RDONLY) will block until the
writing end has been opened.


* Re: gitattributes don't work
From: Jonathan Nieder @ 2011-01-03  1:11 UTC (permalink / raw)
  To: Marcin Wiśnicki; +Cc: git, Nguyễn Thái Ngọc Duy
In-Reply-To: <ifr610$3kl$1@dough.gmane.org>

Marcin Wiśnicki wrote:

> I'm trying to exclude certain paths (those that contain "xmac/gen/") from 
> diff output using .git/info/attributes (not .gitattributes).

Tricky.  Have you tried

 xmac?gen? -diff

?  You might also be interested in the nd/struct-pathspec branch:

 git clone git://repo.or.cz/git.git
 cd git
 git log --grep=nd/struct-pathspec

for some work and explanation on patterns used to specify paths.

Regards,
Jonathan


* [PATCH 8/8] t0081 (line-buffer): add buffering tests
From: Jonathan Nieder @ 2011-01-03  1:07 UTC (permalink / raw)
  To: git; +Cc: David Barr, Thomas Rast, Ramkumar Ramachandra
In-Reply-To: <20110103004900.GA30506@burratino>

POSIX makes the behavior of read(2) from a pipe fairly clear: a read
from an empty pipe will block until there is data available and any
other read will not block, preferring to return a partial result.
Likewise, fread(3) and fgets(3) are clearly specified to act as
though implemented by calling fgetc(3) in a simple loop.  But the
buffering behavior of fgetc is less clear.

Luckily, no sane platform is going to implement fgetc by calling the
equivalent of read(2) more than once.  fgetc has to be able to
return without filling its buffer to preserve errno when errors are
encountered anyway.  So let's assume the simpler behavior (trust) but
add some tests to catch insane platforms that violate that when they
come (verify).

First check that fread can handle a 0-length read from an empty fifo.
The writing end of the fifo is opened in advance in a subshell since
even 0-length reads are allowed to block when the writing end of a
pipe is not open.

Next try short inputs from a pipe that is not filled all the way.

Lastly (two tests) try very large inputs from a pipe that will not fit
in the relevant buffers.  The first of these tests reads a little
more than 8192 bytes, which is BUFSIZ (the size of stdio's buffers)
on this Linux machine.  The second reads a little over 64 KiB (the
pipe capacity on Linux) and is not run unless requested by setting
the GIT_REMOTE_SVN_TEST_BIG_FILES environment variable.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 t/t0081-line-buffer.sh |  110 +++++++++++++++++++++++++++++++++++++++++++++++-
 test-line-buffer.c     |   22 ++++++++-
 2 files changed, 128 insertions(+), 4 deletions(-)

diff --git a/t/t0081-line-buffer.sh b/t/t0081-line-buffer.sh
index 68d6163..33a728e 100755
--- a/t/t0081-line-buffer.sh
+++ b/t/t0081-line-buffer.sh
@@ -1,10 +1,76 @@
 #!/bin/sh
 
 test_description="Test the svn importer's input handling routines.
+
+These tests exercise the line_buffer library, but their real purpose
+is to check the assumptions that library makes of the platform's input
+routines.  Processes engaged in bi-directional communication would
+hang if fread or fgets is too greedy.
+
+While at it, check that input of newlines and null bytes are handled
+correctly.
 "
 . ./test-lib.sh
 
-test_expect_success 'read greeting' '
+test -n "$GIT_REMOTE_SVN_TEST_BIG_FILES" && test_set_prereq EXPENSIVE
+
+generate_tens_of_lines () {
+	tens=$1 &&
+	line=$2 &&
+
+	i=0 &&
+	while test $i -lt "$tens"
+	do
+		for j in a b c d e f g h i j
+		do
+			echo "$line"
+		done &&
+		: $((i = $i + 1)) ||
+		return
+	done
+}
+
+long_read_test () {
+	: each line is 10 bytes, including newline &&
+	line=abcdefghi &&
+	echo "$line" >expect &&
+
+	if ! test_declared_prereq PIPE
+	then
+		echo >&4 "long_read_test: need to declare PIPE prerequisite"
+		return 127
+	fi &&
+	tens_of_lines=$(($1 / 100 + 1)) &&
+	lines=$(($tens_of_lines * 10)) &&
+	readsize=$((($lines - 1) * 10 + 3)) &&
+	copysize=7 &&
+	rm -f input &&
+	mkfifo input &&
+	{
+		{
+			generate_tens_of_lines $tens_of_lines "$line" &&
+			sleep 100
+		} >input &
+	} &&
+	test-line-buffer input <<-EOF >output &&
+	read $readsize
+	copy $copysize
+	EOF
+	kill $! &&
+	test_line_count = $lines output &&
+	tail -n 1 <output >actual &&
+	test_cmp expect actual
+}
+
+test_expect_success 'setup: have pipes?' '
+      rm -f frob &&
+      if mkfifo frob
+      then
+		test_set_prereq PIPE
+      fi
+'
+
+test_expect_success 'hello world' '
 	echo HELLO >expect &&
 	test-line-buffer <<-\EOF >actual &&
 	read 6
@@ -13,6 +79,21 @@ test_expect_success 'read greeting' '
 	test_cmp expect actual
 '
 
+test_expect_success PIPE '0-length read, no input available' '
+	>expect &&
+	rm -f input &&
+	mkfifo input &&
+	{
+		sleep 100 >input &
+	} &&
+	test-line-buffer input <<-\EOF >actual &&
+	read 0
+	copy 0
+	EOF
+	kill $! &&
+	test_cmp expect actual
+'
+
 test_expect_success '0-length read, send along greeting' '
 	echo HELLO >expect &&
 	test-line-buffer <<-\EOF >actual &&
@@ -23,6 +104,33 @@ test_expect_success '0-length read, send along greeting' '
 	test_cmp expect actual
 '
 
+test_expect_success PIPE '1-byte read, no input available' '
+	printf "%s" ab >expect &&
+	rm -f input &&
+	mkfifo input &&
+	{
+		{
+			printf "%s" a &&
+			printf "%s" b &&
+			sleep 100
+		} >input &
+	} &&
+	test-line-buffer input <<-\EOF >actual &&
+	read 1
+	copy 1
+	EOF
+	kill $! &&
+	test_cmp expect actual
+'
+
+test_expect_success PIPE 'long read (around 8192 bytes)' '
+	long_read_test 8192
+'
+
+test_expect_success PIPE,EXPENSIVE 'longer read (around 65536 bytes)' '
+	long_read_test 65536
+'
+
 test_expect_success 'buffer_read_string copes with null byte' '
 	>expect &&
 	q_to_nul <<-\EOF | test-line-buffer >actual &&
diff --git a/test-line-buffer.c b/test-line-buffer.c
index da0bc65..ec19b13 100644
--- a/test-line-buffer.c
+++ b/test-line-buffer.c
@@ -49,15 +49,31 @@ static void handle_line(const char *line, struct line_buffer *stdin_buf)
 int main(int argc, char *argv[])
 {
 	struct line_buffer stdin_buf = LINE_BUFFER_INIT;
+	struct line_buffer file_buf = LINE_BUFFER_INIT;
+	struct line_buffer *input = &stdin_buf;
+	const char *filename;
 	char *s;
 
-	if (argc != 1)
-		usage("test-line-buffer < script");
+	if (argc == 1)
+		filename = NULL;
+	else if (argc == 2)
+		filename = argv[1];
+	else
+		usage("test-line-buffer [file] < script");
 
 	if (buffer_init(&stdin_buf, NULL))
 		die_errno("open error");
+	if (filename) {
+		if (buffer_init(&file_buf, filename))
+			die_errno("error opening %s", filename);
+		input = &file_buf;
+	}
+
 	while ((s = buffer_read_line(&stdin_buf)))
-		handle_line(s, &stdin_buf);
+		handle_line(s, input);
+
+	if (filename && buffer_deinit(&file_buf))
+		die("error reading from %s", filename);
 	if (buffer_deinit(&stdin_buf))
 		die("input error");
 	if (ferror(stdout))
-- 
1.7.4.rc0.580.g89dc.dirty


* [PATCH 7/8] vcs-svn: tweak test-line-buffer to not assume line-oriented input
From: Jonathan Nieder @ 2011-01-03  0:52 UTC (permalink / raw)
  To: git; +Cc: David Barr, Thomas Rast, Ramkumar Ramachandra
In-Reply-To: <20110103004900.GA30506@burratino>

Do not expect an implicit newline after each input record.
Use a separate command to exercise buffer_skip_bytes.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 t/t0081-line-buffer.sh |   27 +++++++++++++--------------
 test-line-buffer.c     |   10 +++++++---
 2 files changed, 20 insertions(+), 17 deletions(-)

diff --git a/t/t0081-line-buffer.sh b/t/t0081-line-buffer.sh
index 13ac735..68d6163 100755
--- a/t/t0081-line-buffer.sh
+++ b/t/t0081-line-buffer.sh
@@ -7,45 +7,44 @@ test_description="Test the svn importer's input handling routines.
 test_expect_success 'read greeting' '
 	echo HELLO >expect &&
 	test-line-buffer <<-\EOF >actual &&
-	read 5
+	read 6
 	HELLO
 	EOF
 	test_cmp expect actual
 '
 
 test_expect_success '0-length read, send along greeting' '
-	printf "%s\n" "" HELLO >expect &&
+	echo HELLO >expect &&
 	test-line-buffer <<-\EOF >actual &&
 	read 0
-
-	copy 5
+	copy 6
 	HELLO
 	EOF
 	test_cmp expect actual
 '
 
-test_expect_success 'buffer_read_string copes with trailing null byte' '
-	echo >expect &&
+test_expect_success 'buffer_read_string copes with null byte' '
+	>expect &&
 	q_to_nul <<-\EOF | test-line-buffer >actual &&
-	read 1
+	read 2
 	Q
 	EOF
 	test_cmp expect actual
 '
 
-test_expect_success '0-length read, copy null byte' '
-	printf "%s\n" "" Q | q_to_nul >expect &&
+test_expect_success 'skip, copy null byte' '
+	echo Q | q_to_nul >expect &&
 	q_to_nul <<-\EOF | test-line-buffer >actual &&
-	read 0
-
-	copy 1
+	skip 2
+	Q
+	copy 2
 	Q
 	EOF
 	test_cmp expect actual
 '
 
 test_expect_success 'long reads are truncated' '
-	printf "%s\n" foo "" >expect &&
+	echo foo >expect &&
 	test-line-buffer <<-\EOF >actual &&
 	read 5
 	foo
@@ -56,7 +55,7 @@ test_expect_success 'long reads are truncated' '
 test_expect_success 'long copies are truncated' '
 	printf "%s\n" "" foo >expect &&
 	test-line-buffer <<-\EOF >actual &&
-	read 0
+	read 1
 
 	copy 5
 	foo
diff --git a/test-line-buffer.c b/test-line-buffer.c
index 383f35b..da0bc65 100644
--- a/test-line-buffer.c
+++ b/test-line-buffer.c
@@ -19,14 +19,18 @@ static void handle_command(const char *command, const char *arg, struct line_buf
 	switch (*command) {
 	case 'c':
 		if (!prefixcmp(command, "copy ")) {
-			buffer_copy_bytes(buf, strtouint32(arg) + 1);
+			buffer_copy_bytes(buf, strtouint32(arg));
 			return;
 		}
 	case 'r':
 		if (!prefixcmp(command, "read ")) {
 			const char *s = buffer_read_string(buf, strtouint32(arg));
-			printf("%s\n", s);
-			buffer_skip_bytes(buf, 1);	/* consume newline */
+			fputs(s, stdout);
+			return;
+		}
+	case 's':
+		if (!prefixcmp(command, "skip ")) {
+			buffer_skip_bytes(buf, strtouint32(arg));
 			return;
 		}
 	default:
-- 
1.7.4.rc0.580.g89dc.dirty


* [PATCH 6/8] tests: give vcs-svn/line_buffer its own test script
From: Jonathan Nieder @ 2011-01-03  0:51 UTC (permalink / raw)
  To: git; +Cc: David Barr, Thomas Rast, Ramkumar Ramachandra
In-Reply-To: <20110103004900.GA30506@burratino>

Split the line_buffer test into small pieces and move it to its
own file as preparation for adding more tests.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 t/t0080-vcs-svn.sh     |   54 --------------------------------------
 t/t0081-line-buffer.sh |   67 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 67 insertions(+), 54 deletions(-)
 create mode 100755 t/t0081-line-buffer.sh

diff --git a/t/t0080-vcs-svn.sh b/t/t0080-vcs-svn.sh
index 8be9700..99a314b 100755
--- a/t/t0080-vcs-svn.sh
+++ b/t/t0080-vcs-svn.sh
@@ -76,60 +76,6 @@ test_expect_success 'obj pool: high-water mark' '
 	test_cmp expected actual
 '
 
-test_expect_success 'line buffer' '
-	echo HELLO >expected1 &&
-	printf "%s\n" "" HELLO >expected2 &&
-	echo >expected3 &&
-	printf "%s\n" "" Q | q_to_nul >expected4 &&
-	printf "%s\n" foo "" >expected5 &&
-	printf "%s\n" "" foo >expected6 &&
-
-	test-line-buffer <<-\EOF >actual1 &&
-	read 5
-	HELLO
-	EOF
-
-	test-line-buffer <<-\EOF >actual2 &&
-	read 0
-
-	copy 5
-	HELLO
-	EOF
-
-	q_to_nul <<-\EOF |
-	read 1
-	Q
-	EOF
-	test-line-buffer >actual3 &&
-
-	q_to_nul <<-\EOF |
-	read 0
-
-	copy 1
-	Q
-	EOF
-	test-line-buffer >actual4 &&
-
-	test-line-buffer <<-\EOF >actual5 &&
-	read 5
-	foo
-	EOF
-
-	test-line-buffer <<-\EOF >actual6 &&
-	read 0
-
-	copy 5
-	foo
-	EOF
-
-	test_cmp expected1 actual1 &&
-	test_cmp expected2 actual2 &&
-	test_cmp expected3 actual3 &&
-	test_cmp expected4 actual4 &&
-	test_cmp expected5 actual5 &&
-	test_cmp expected6 actual6
-'
-
 test_expect_success 'string pool' '
 	echo a does not equal b >expected.differ &&
 	echo a equals a >expected.match &&
diff --git a/t/t0081-line-buffer.sh b/t/t0081-line-buffer.sh
new file mode 100755
index 0000000..13ac735
--- /dev/null
+++ b/t/t0081-line-buffer.sh
@@ -0,0 +1,67 @@
+#!/bin/sh
+
+test_description="Test the svn importer's input handling routines.
+"
+. ./test-lib.sh
+
+test_expect_success 'read greeting' '
+	echo HELLO >expect &&
+	test-line-buffer <<-\EOF >actual &&
+	read 5
+	HELLO
+	EOF
+	test_cmp expect actual
+'
+
+test_expect_success '0-length read, send along greeting' '
+	printf "%s\n" "" HELLO >expect &&
+	test-line-buffer <<-\EOF >actual &&
+	read 0
+
+	copy 5
+	HELLO
+	EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'buffer_read_string copes with trailing null byte' '
+	echo >expect &&
+	q_to_nul <<-\EOF | test-line-buffer >actual &&
+	read 1
+	Q
+	EOF
+	test_cmp expect actual
+'
+
+test_expect_success '0-length read, copy null byte' '
+	printf "%s\n" "" Q | q_to_nul >expect &&
+	q_to_nul <<-\EOF | test-line-buffer >actual &&
+	read 0
+
+	copy 1
+	Q
+	EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'long reads are truncated' '
+	printf "%s\n" foo "" >expect &&
+	test-line-buffer <<-\EOF >actual &&
+	read 5
+	foo
+	EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'long copies are truncated' '
+	printf "%s\n" "" foo >expect &&
+	test-line-buffer <<-\EOF >actual &&
+	read 0
+
+	copy 5
+	foo
+	EOF
+	test_cmp expect actual
+'
+
+test_done
-- 
1.7.4.rc0.580.g89dc.dirty


* [PATCH 5/8] vcs-svn: make test-line-buffer input format more flexible
From: Jonathan Nieder @ 2011-01-03  0:50 UTC (permalink / raw)
  To: git; +Cc: David Barr, Thomas Rast, Ramkumar Ramachandra
In-Reply-To: <20110103004900.GA30506@burratino>

Imitate the input format of test-obj-pool to support arbitrary
sequences of commands rather than alternating read/copy.  This should
make it easier to add tests that exercise other line_buffer functions.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 t/t0080-vcs-svn.sh |   18 ++++++++--------
 test-line-buffer.c |   56 +++++++++++++++++++++++++++++++++------------------
 2 files changed, 45 insertions(+), 29 deletions(-)

diff --git a/t/t0080-vcs-svn.sh b/t/t0080-vcs-svn.sh
index d3225ad..8be9700 100755
--- a/t/t0080-vcs-svn.sh
+++ b/t/t0080-vcs-svn.sh
@@ -85,40 +85,40 @@ test_expect_success 'line buffer' '
 	printf "%s\n" "" foo >expected6 &&
 
 	test-line-buffer <<-\EOF >actual1 &&
-	5
+	read 5
 	HELLO
 	EOF
 
 	test-line-buffer <<-\EOF >actual2 &&
-	0
+	read 0
 
-	5
+	copy 5
 	HELLO
 	EOF
 
 	q_to_nul <<-\EOF |
-	1
+	read 1
 	Q
 	EOF
 	test-line-buffer >actual3 &&
 
 	q_to_nul <<-\EOF |
-	0
+	read 0
 
-	1
+	copy 1
 	Q
 	EOF
 	test-line-buffer >actual4 &&
 
 	test-line-buffer <<-\EOF >actual5 &&
-	5
+	read 5
 	foo
 	EOF
 
 	test-line-buffer <<-\EOF >actual6 &&
-	0
+	read 0
 
-	5
+	copy 5
 	foo
 	EOF
 
diff --git a/test-line-buffer.c b/test-line-buffer.c
index f9af892..383f35b 100644
--- a/test-line-buffer.c
+++ b/test-line-buffer.c
@@ -1,11 +1,5 @@
 /*
  * test-line-buffer.c: code to exercise the svn importer's input helper
- *
- * Input format:
- *	number NL
- *	(number bytes) NL
- *	number NL
- *	...
  */
 
 #include "git-compat-util.h"
@@ -20,28 +14,50 @@ static uint32_t strtouint32(const char *s)
 	return (uint32_t) n;
 }
 
+static void handle_command(const char *command, const char *arg, struct line_buffer *buf)
+{
+	switch (*command) {
+	case 'c':
+		if (!prefixcmp(command, "copy ")) {
+			buffer_copy_bytes(buf, strtouint32(arg) + 1);
+			return;
+		}
+	case 'r':
+		if (!prefixcmp(command, "read ")) {
+			const char *s = buffer_read_string(buf, strtouint32(arg));
+			printf("%s\n", s);
+			buffer_skip_bytes(buf, 1);	/* consume newline */
+			return;
+		}
+	default:
+		die("unrecognized command: %s", command);
+	}
+}
+
+static void handle_line(const char *line, struct line_buffer *stdin_buf)
+{
+	const char *arg = strchr(line, ' ');
+	if (!arg)
+		die("no argument in line: %s", line);
+	handle_command(line, arg + 1, stdin_buf);
+}
+
 int main(int argc, char *argv[])
 {
-	struct line_buffer buf = LINE_BUFFER_INIT;
+	struct line_buffer stdin_buf = LINE_BUFFER_INIT;
 	char *s;
 
 	if (argc != 1)
-		usage("test-line-buffer < input.txt");
-	if (buffer_init(&buf, NULL))
+		usage("test-line-buffer < script");
+
+	if (buffer_init(&stdin_buf, NULL))
 		die_errno("open error");
-	while ((s = buffer_read_line(&buf))) {
-		s = buffer_read_string(&buf, strtouint32(s));
-		fputs(s, stdout);
-		fputc('\n', stdout);
-		buffer_skip_bytes(&buf, 1);
-		if (!(s = buffer_read_line(&buf)))
-			break;
-		buffer_copy_bytes(&buf, strtouint32(s) + 1);
-	}
-	if (buffer_deinit(&buf))
+	while ((s = buffer_read_line(&stdin_buf)))
+		handle_line(s, &stdin_buf);
+	if (buffer_deinit(&stdin_buf))
 		die("input error");
 	if (ferror(stdout))
 		die("output error");
-	buffer_reset(&buf);
+	buffer_reset(&stdin_buf);
 	return 0;
 }
-- 
1.7.4.rc0.580.g89dc.dirty


