git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] builtin-apply: check for empty files when detecting creation patch
@ 2008-05-08 14:39 Imre Deak
  2008-05-11  2:36 ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: Imre Deak @ 2008-05-08 14:39 UTC (permalink / raw)
  To: git; +Cc: imre.deak

When we can only guess if we have a creation patch, we do
this by treating the patch as such if there weren't any old
lines. Zero length files can be patched without old lines
though, so do an extra check for file size.

Signed-off-by: Imre Deak <imre.deak@gmail.com>
---
 builtin-apply.c |   34 +++++++++++++++++++++++++++++++++-
 1 files changed, 33 insertions(+), 1 deletions(-)

diff --git a/builtin-apply.c b/builtin-apply.c
index caa3f2a..80d0779 100644
--- a/builtin-apply.c
+++ b/builtin-apply.c
@@ -1096,6 +1096,33 @@ static int parse_fragment(char *line, unsigned long size,
 	return offset;
 }
 
+static size_t existing_file_size(const char *file)
+{
+	size_t st_size = -1;
+
+	if (file == NULL)
+		return -1;
+
+	if (cached) {
+		struct cache_entry *ce;
+		int pos;
+
+		pos = cache_name_pos(file, strlen(file));
+		if (pos < 0)
+			return -1;
+		ce = active_cache[pos];
+		st_size = ntohl(ce->ce_size);
+	} else {
+		struct stat st;
+
+		if (lstat(file, &st) < 0)
+			return -1;
+		st_size = st.st_size;
+	}
+
+	return st_size;
+}
+
 static int parse_single_patch(char *line, unsigned long size, struct patch *patch)
 {
 	unsigned long offset = 0;
@@ -1143,13 +1170,18 @@ static int parse_single_patch(char *line, unsigned long size, struct patch *patc
 	if (patch->is_delete < 0 &&
 	    (newlines || (patch->fragments && patch->fragments->next)))
 		patch->is_delete = 0;
+	/* FIXME: How can be there context if it's a creation / deletion? */
 	if (!unidiff_zero || context) {
 		/* If the user says the patch is not generated with
 		 * --unified=0, or if we have seen context lines,
 		 * then not having oldlines means the patch is creation,
 		 * and not having newlines means the patch is deletion.
+		 *
+		 * It's also possible that a zero length file is added
+		 * to.
 		 */
-		if (patch->is_new < 0 && !oldlines) {
+		if (patch->is_new < 0 && !oldlines &&
+		    existing_file_size(patch->old_name) != 0) {
 			patch->is_new = 1;
 			patch->old_name = NULL;
 		}
-- 
1.5.4.2.dirty

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] builtin-apply: check for empty files when detecting creation patch
  2008-05-08 14:39 [PATCH] builtin-apply: check for empty files when detecting creation patch Imre Deak
@ 2008-05-11  2:36 ` Junio C Hamano
  2008-05-13 20:16   ` Imre Deak
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2008-05-11  2:36 UTC (permalink / raw)
  To: Imre Deak; +Cc: git

Imre Deak <imre.deak@gmail.com> writes:

> When we can only guess if we have a creation patch, we do
> this by treating the patch as such if there weren't any old
> lines. Zero length files can be patched without old lines
> though, so do an extra check for file size.

You described what your patch does, but you did not explain why it is a
good addition.  One way to do so is to illustrate in what occasion what
the existing code does is insufficient.

> +static size_t existing_file_size(const char *file)
> +{
> +	size_t st_size = -1;
> +
> +	if (file == NULL)
> +		return -1;
> +	if (cached) {
> +		struct cache_entry *ce;
> +		int pos;
> +
> +		pos = cache_name_pos(file, strlen(file));
> +		if (pos < 0)
> +			return -1;
> +		ce = active_cache[pos];
> +		st_size = ntohl(ce->ce_size);

ntohl()?  I thought ce->ce_* are host-native byte order these days...

> +	} else {
> +		struct stat st;
> +
> +		if (lstat(file, &st) < 0)
> +			return -1;

Doesn't this break the use case where "git-apply --stat" is used as an
improved diffstat outside a git repository?

> @@ -1143,13 +1170,18 @@ static int parse_single_patch(char *line, unsigned long size, struct patch *patc
>  	if (patch->is_delete < 0 &&
>  	    (newlines || (patch->fragments && patch->fragments->next)))
>  		patch->is_delete = 0;
> +	/* FIXME: How can be there context if it's a creation / deletion? */
>  	if (!unidiff_zero || context) {
>  		/* If the user says the patch is not generated with
>  		 * --unified=0, or if we have seen context lines,
>  		 * then not having oldlines means the patch is creation,
>  		 * and not having newlines means the patch is deletion.
> +		 *
> +		 * It's also possible that a zero length file is added
> +		 * to.
>  		 */
> -		if (patch->is_new < 0 && !oldlines) {
> +		if (patch->is_new < 0 && !oldlines &&
> +		    existing_file_size(patch->old_name) != 0) {
>  			patch->is_new = 1;
>  			patch->old_name = NULL;
>  		}

The user did not say the patch was produced without context, or we do have
context.  The latter cannot be a creation patch so the new logic is not
appropriate.  But let's forget that problem for now and look at the case
where the patch did _not_ have any context, i.e. only added and deleted
lines.

If the patch did not have context, and the user did not ask for -u0 patch
when it was produced, it could be a creation patch, but if there are
deleted lines it cannot be.  That is the original logic.

After your patch, the original logic is allowed to decide that the patch
is a creation _only if_ you happen to already have a file that is _to be
created_ in the work tree with some existing contents, or the file does
not exist.  I do not see a sane logic behind that.  If you were making
sure that the work tree does _not_ have the file, then I would understand,
even though I think it is wrong for "apply --stat" case.  If you see a
file in the work tree, and if you assume the patch would apply to the
work tree, then the patch cannot be creation!

In general, it is not right to look at the work tree to decide how to
interpret what the patch means to begin with, but maybe you are trying to
use work tree status as a hint to disambiguate a corner case that the
information in a patch we are reading is insufficient, in which case it
might be Ok.  But I cannot tell what that corner case is.

I am lost.  Please explain what you are trying to fix first before
explaining how you attempted to fix it.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] builtin-apply: check for empty files when detecting creation patch
  2008-05-11  2:36 ` Junio C Hamano
@ 2008-05-13 20:16   ` Imre Deak
  2008-05-13 21:48     ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: Imre Deak @ 2008-05-13 20:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi,

thanks for your answer.

On Sun, May 11, 2008 at 5:36 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Imre Deak <imre.deak@gmail.com> writes:
>
>  > When we can only guess if we have a creation patch, we do
>  > this by treating the patch as such if there weren't any old
>  > lines. Zero length files can be patched without old lines
>  > though, so do an extra check for file size.
>
>  You described what your patch does, but you did not explain why it is a
>  good addition.  One way to do so is to illustrate in what occasion what
>  the existing code does is insufficient.

The patch makes it possible to apply foreign patches (not created with
git diff) to zero length files already existing in the index. The problem:

$ git init
Initialized empty Git repository in .git/
$ rm -rf a
$ touch a
$ git add a
$ git commit -madd
Created initial commit 818f2b7: add
 0 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 a
$ echo 123 > a.new
$ diff -u a a.new > patch
$ git apply patch
error: a: already exists in working directory

The error happens because git guesses that `patch` is a creation patch
and since `a` already exists in the index it will bail out. With the
modification
foreign patches won't be guessed to be a creation patch if the file to be
patch already exists in the index or the working directory.

>
>
>  > +static size_t existing_file_size(const char *file)
>  > +{
>  > +     size_t st_size = -1;
>  > +
>  > +     if (file == NULL)
>  > +             return -1;
>  > +     if (cached) {
>  > +             struct cache_entry *ce;
>  > +             int pos;
>  > +
>  > +             pos = cache_name_pos(file, strlen(file));
>  > +             if (pos < 0)
>  > +                     return -1;
>  > +             ce = active_cache[pos];
>  > +             st_size = ntohl(ce->ce_size);
>
>  ntohl()?  I thought ce->ce_* are host-native byte order these days...

Sorry, I actually created the patch against an older version. It should be
without ntohl().

>
>
>  > +     } else {
>  > +             struct stat st;
>  > +
>  > +             if (lstat(file, &st) < 0)
>  > +                     return -1;
>
>  Doesn't this break the use case where "git-apply --stat" is used as an
>  improved diffstat outside a git repository?

In that case we'll return error (file doesn't exist) which will leave
the possibility
open to guess the patch to be a creation patch. With the change when
doing --stat the patch might or might not be changed to a creation patch
based on the current working directory content. Though I think this doesn't
result in any difference in the --stat output one could argue that --stat's
operation shouldn't depend in any way on the working directory content
except when --check is specified in addition. So probably we should special
case --stat here...

>
>
>  > @@ -1143,13 +1170,18 @@ static int parse_single_patch(char *line, unsigned long size, struct patch *patc
>  >       if (patch->is_delete < 0 &&
>  >           (newlines || (patch->fragments && patch->fragments->next)))
>  >               patch->is_delete = 0;
>  > +     /* FIXME: How can be there context if it's a creation / deletion? */
>  >       if (!unidiff_zero || context) {
>  >               /* If the user says the patch is not generated with
>  >                * --unified=0, or if we have seen context lines,
>  >                * then not having oldlines means the patch is creation,
>  >                * and not having newlines means the patch is deletion.
>  > +              *
>  > +              * It's also possible that a zero length file is added
>  > +              * to.
>  >                */
>  > -             if (patch->is_new < 0 && !oldlines) {
>  > +             if (patch->is_new < 0 && !oldlines &&
>  > +                 existing_file_size(patch->old_name) != 0) {
>  >                       patch->is_new = 1;
>  >                       patch->old_name = NULL;
>  >               }
>
>  The user did not say the patch was produced without context, or we do have
>  context.  The latter cannot be a creation patch so the new logic is not
>  appropriate.  But let's forget that problem for now and look at the case
>  where the patch did _not_ have any context, i.e. only added and deleted
>  lines.
>
>  If the patch did not have context, and the user did not ask for -u0 patch
>  when it was produced, it could be a creation patch, but if there are
>  deleted lines it cannot be.  That is the original logic.

So as far as I understand creation or deletion patches cannot have any context
lines. If that's correct shouldn't the if clause

if (!unidiff_zero || context)

read instead

if (!unidiff_zero)

?

>
>  After your patch, the original logic is allowed to decide that the patch
>  is a creation _only if_ you happen to already have a file that is _to be
>  created_ in the work tree with some existing contents, or the file does
>  not exist.  I do not see a sane logic behind that.  If you were making
>  sure that the work tree does _not_ have the file, then I would understand,
>  even though I think it is wrong for "apply --stat" case.  If you see a
>  file in the work tree, and if you assume the patch would apply to the
>  work tree, then the patch cannot be creation!

You are right, that condition is wrong. The right one then should be:

existing_file_size(patch->old_name) < 0

>
>  In general, it is not right to look at the work tree to decide how to
>  interpret what the patch means to begin with, but maybe you are trying to
>  use work tree status as a hint to disambiguate a corner case that the
>  information in a patch we are reading is insufficient, in which case it
>  might be Ok.  But I cannot tell what that corner case is.

Hm, I agree with that if it's only a --stat, but if it's --check or
--apply I think
there is no other way to guess if we really have a creation patch or only an
add-to-a-zero-length-file patch.

--Imre

>
>  I am lost.  Please explain what you are trying to fix first before
>  explaining how you attempted to fix it.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] builtin-apply: check for empty files when detecting creation patch
  2008-05-13 20:16   ` Imre Deak
@ 2008-05-13 21:48     ` Junio C Hamano
  2008-05-13 22:24       ` Linus Torvalds
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2008-05-13 21:48 UTC (permalink / raw)
  To: Imre Deak, Linus Torvalds; +Cc: git

"Imre Deak" <imre.deak@gmail.com> writes:

> On Sun, May 11, 2008 at 5:36 AM, Junio C Hamano <gitster@pobox.com> wrote:
>> Imre Deak <imre.deak@gmail.com> writes:
>>
>>  > When we can only guess if we have a creation patch, we do
>>  > this by treating the patch as such if there weren't any old
>>  > lines. Zero length files can be patched without old lines
>>  > though, so do an extra check for file size.
>>
>>  You described what your patch does, but you did not explain why it is a
>>  good addition.  One way to do so is to illustrate in what occasion what
>>  the existing code does is insufficient.
>
> The patch makes it possible to apply foreign patches (not created with
> git diff) to zero length files already existing in the index. The problem:
>
> $ git init
> Initialized empty Git repository in .git/
> $ rm -rf a
> $ touch a
> $ git add a
> $ git commit -madd
> Created initial commit 818f2b7: add
>  0 files changed, 0 insertions(+), 0 deletions(-)
>  create mode 100644 a
> $ echo 123 > a.new
> $ diff -u a a.new > patch
> $ git apply patch
> error: a: already exists in working directory
>
> The error happens because git guesses that `patch` is a creation patch
> and since `a` already exists in the index it will bail out.

Ok, that is much clearer.  How about this two-liner instead, then (the
first hunk is just an unreleated typofix)?

Originally when Linus wrote "git-apply" he was trying to be very cautious
and used the /dev/null cue only in a positive way (i.e. if the patch is
from /dev/null it is a creation).  But the preimage being something other
than /dev/null was not used as a statement that the patch is not a
creation.

Non SCM patches of any nontrivial size would be created by comparing two
trees with "diff -ru" (with some more combination of options).  We would
reliably use /dev/null cue in this case --- if the patch is from /dev/null
it is creation but more importantly if it is not from /dev/null it is not
creation but modification.

Patches from foreign SCMs also follow the /dev/null convention (e.g. SVN
and CVS --- I did not check Hg but I would be surprised if it didn't
follow suit), and we can reliably use the lack of /dev/null as a cue that
it is not a creation patch.

A single-file non SCM patches are done by comparing $a.orig and $a, $a~
and $a, etc., i.e. a pair of files the original and the modified with some
file suffixes.  What would people do to represent a creation in such a
case?  --- You are right.  By doing "diff -u /dev/null $a" (it is more
cumbersome to do "touch $a.empty; diff -u $a.empty $a").

So I think it is reasonable to use non-/dev/null-ness of first as a cue
that it is not a creation patch.

Linus, what do you think?  FYR, the original patch that led to this
conversation was:

    http://thread.gmane.org/gmane.comp.version-control.git/81531

---

 builtin-apply.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/builtin-apply.c b/builtin-apply.c
index 1103625..9d7cb05 100644
--- a/builtin-apply.c
+++ b/builtin-apply.c
@@ -418,7 +418,7 @@ static int guess_p_value(const char *nameline)
 }
 
 /*
- * Get the name etc info from the --/+++ lines of a traditional patch header
+ * Get the name etc info from the ---/+++ lines of a traditional patch header
  *
  * FIXME! The end-of-filename heuristics are kind of screwy. For existing
  * files, we can happily check the index for a match, but for creating a
@@ -451,6 +451,8 @@ static void parse_traditional_patch(const char *first, const char *second, struc
 		name = find_name(first, NULL, p_value, TERM_SPACE | TERM_TAB);
 		patch->old_name = name;
 	} else {
+		patch->is_new = 0;
+		patch->is_delete = 0;
 		name = find_name(first, NULL, p_value, TERM_SPACE | TERM_TAB);
 		name = find_name(second, name, p_value, TERM_SPACE | TERM_TAB);
 		patch->old_name = patch->new_name = name;

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] builtin-apply: check for empty files when detecting creation patch
  2008-05-13 21:48     ` Junio C Hamano
@ 2008-05-13 22:24       ` Linus Torvalds
  2008-05-13 22:34         ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: Linus Torvalds @ 2008-05-13 22:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Imre Deak, git



On Tue, 13 May 2008, Junio C Hamano wrote:
> 
> So I think it is reasonable to use non-/dev/null-ness of first as a cue
> that it is not a creation patch.

I disagree.

The fact is, /dev/null means that for patches generated by GNU diff, but a 
lot of other systems you'll find that it means no such thing.

Look at CVS-generated patches, or SVN for that matter. The diffs look like 
this:

	Index: file
	===================================================================
	--- file (revision 0)
	+++ file (working copy)
	@@ -0,0 +1 @@
	+test

and there is no /dev/null there.

The thing is, git-apply is careful, and it's very much careful with 
respect to *knowing* that there are lots of different versions of "diff" 
floating around, and lots of different SCM systems that generate odd diff 
headers. We should absolutely NOT start expecting that diffs are only 
generated with GNU diff.

So non-/dev/null'ness means absolutely nothing. It means "don't know", and 
we should leave is_new and is_delete as -1.

		Linus

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] builtin-apply: check for empty files when detecting creation patch
  2008-05-13 22:24       ` Linus Torvalds
@ 2008-05-13 22:34         ` Junio C Hamano
  2008-05-13 22:58           ` Linus Torvalds
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2008-05-13 22:34 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Imre Deak, git

Linus Torvalds <torvalds@linux-foundation.org> writes:

> Look at CVS-generated patches, or SVN for that matter. The diffs look like 
> this:
>
> 	Index: file
> 	===================================================================
> 	--- file (revision 0)
> 	+++ file (working copy)
> 	@@ -0,0 +1 @@
> 	+test
>
> and there is no /dev/null there.
>
> The thing is, git-apply is careful, and it's very much careful with 
> respect to *knowing* that there are lots of different versions of "diff" 
> floating around, and lots of different SCM systems that generate odd diff 
> headers. We should absolutely NOT start expecting that diffs are only 
> generated with GNU diff.
>
> So non-/dev/null'ness means absolutely nothing. It means "don't know", and 
> we should leave is_new and is_delete as -1.

Ok, then what's the judgement for the original issue?  Is it a user error
to have a tracked absolutely empty file in the index?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] builtin-apply: check for empty files when detecting creation patch
  2008-05-13 22:34         ` Junio C Hamano
@ 2008-05-13 22:58           ` Linus Torvalds
  2008-05-14  0:13             ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: Linus Torvalds @ 2008-05-13 22:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Imre Deak, git



On Tue, 13 May 2008, Junio C Hamano wrote:

> Linus Torvalds <torvalds@linux-foundation.org> writes:
> >
> > So non-/dev/null'ness means absolutely nothing. It means "don't know", and 
> > we should leave is_new and is_delete as -1.
> 
> Ok, then what's the judgement for the original issue?  Is it a user error
> to have a tracked absolutely empty file in the index?

I think this is the fundamental problem:

	..
	if (patch->is_new < 0 && !oldlines) {
		patch->is_new = 1;
	..

because that logic simply isn't right. (is_new < 0 && !oldlines) does 
*not* mean that it must be new.

We can say it the other way around, of course:

	if (patch->is_new < 0 && oldlines)
		patch->is_new = 0;

and that's a valid rule, but I think we already would never set "is_new" 
to -1 if we had old lines, so that would probably be a pointless thing to 
do.

So: remove the check for (is_new < 0 && !oldlines) because it doesn't 
actually add any information, and leave "is_new" as unknown until later 
when we actually *see* that file or not. Hmm?

		Linus

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] builtin-apply: check for empty files when detecting creation patch
  2008-05-13 22:58           ` Linus Torvalds
@ 2008-05-14  0:13             ` Junio C Hamano
  2008-05-14  1:14               ` Linus Torvalds
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2008-05-14  0:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Imre Deak, git

Linus Torvalds <torvalds@linux-foundation.org> writes:

> I think this is the fundamental problem:
>
> 	..
> 	if (patch->is_new < 0 && !oldlines) {
> 		patch->is_new = 1;
> 	..
>
> because that logic simply isn't right. (is_new < 0 && !oldlines) does 
> *not* mean that it must be new.
>
> We can say it the other way around, of course:
>
> 	if (patch->is_new < 0 && oldlines)
> 		patch->is_new = 0;
>
> and that's a valid rule, but I think we already would never set "is_new" 
> to -1 if we had old lines, so that would probably be a pointless thing to 
> do.

Yeah, in fact we have this:

	if (patch->is_new < 0 &&
	    (oldlines || (patch->fragments && patch->fragments->next)))
		patch->is_new = 0;

a several lines above the piece you quoted in parse_single_patch().


> So: remove the check for (is_new < 0 && !oldlines) because it doesn't 
> actually add any information, and leave "is_new" as unknown until later 
> when we actually *see* that file or not. Hmm?

That leads to the similar logic to remove "If we do not know if it is
delete or not, not adding any lines (or any replacement lines) means
delete" logic as well, which in turn means the whole if (!unidiff_zero...)
block would go.

Which actually is a good thing, I think.  As Imre mentioned, I do not
think 

	if (!unidiff_zero || context)

makes sense.  I cannot guess what the intention of that was.

    commit 4be609625e48e908f2b76d35bfeb61a8ba3a83a0
    Author: Junio C Hamano <junkio@cox.net>
    Date:   Sun Sep 17 01:04:24 2006 -0700

        apply --unidiff-zero: loosen sanity checks for --unidiff=0 patches

        In "git-apply", we have a few sanity checks and heuristics that
        expects that the patch fed to us is a unified diff with at least
        one line of context.

         ...description of good tests)...

        These sanity checks are good safety measures, but breaks down
        when people feed a diff generated with --unified=0.  This was
        recently noticed first by Matthew Wilcox and Gerrit Pape.

        This adds a new flag, --unified-zero, to allow bypassing these
        checks.  If you are in control of the patch generation process,
        you should not use --unified=0 patch and fix it up with this
        flag; rather you should try work with a patch with context.  But
        if all you have to work with is a patch without context, this
        flag may come handy as the last resort.

        Signed-off-by: Junio C Hamano <junkio@cox.net>

So unless unidiff-zero is given, we would want extra checks inside that
block.  Also if we have context in the patch anywhere, that means the
patch cannot be coming from "diff -u0", so we go into that block.

Even though other checks that are guarded with "if (!unidiff_zero)"
elsewhere in the code do make sense, however, this block may not do
anything useful, as you pointed out.

With the change to remove the whole block, all tests still passes, and a
limited test with this:

        --- empty	2008-05-13 16:56:57.000000000 -0700
        +++ empty.1	2008-05-13 16:57:07.000000000 -0700
        @@ -0,0 +1 @@
        +foo

to update an originally empty file "empty" also seems to work.

However, with this change, it no longer allows you to accept such a patch
and treat it as a creation of "empty".  Instead we barf with "error:
empty: No such file or directory", if you do not have an empty "empty"
file in the work tree when you run "git apply" on the above patch.

When "diff" was run with flags that could produce context (that is what we
get from !unidiff_zero), if we do not see any lines that begin with '-',
the only case that it is not a creation patch is if you compared the
postimage with a preimage of size 0.  Because we do not have usable
/dev/null cue, we cannot tell that case from a creation patch.

I am having a strong suspicion that doing this change is robbing Peter to
pay Imre.  We simply cannot have it both ways, methinks.

---

 builtin-apply.c |   17 +----------------
 1 files changed, 1 insertions(+), 16 deletions(-)

diff --git a/builtin-apply.c b/builtin-apply.c
index 1103625..395f16b 100644
--- a/builtin-apply.c
+++ b/builtin-apply.c
@@ -418,7 +418,7 @@ static int guess_p_value(const char *nameline)
 }
 
 /*
- * Get the name etc info from the --/+++ lines of a traditional patch header
+ * Get the name etc info from the ---/+++ lines of a traditional patch header
  *
  * FIXME! The end-of-filename heuristics are kind of screwy. For existing
  * files, we can happily check the index for a match, but for creating a
@@ -1143,21 +1143,6 @@ static int parse_single_patch(char *line, unsigned long size, struct patch *patc
 	if (patch->is_delete < 0 &&
 	    (newlines || (patch->fragments && patch->fragments->next)))
 		patch->is_delete = 0;
-	if (!unidiff_zero || context) {
-		/* If the user says the patch is not generated with
-		 * --unified=0, or if we have seen context lines,
-		 * then not having oldlines means the patch is creation,
-		 * and not having newlines means the patch is deletion.
-		 */
-		if (patch->is_new < 0 && !oldlines) {
-			patch->is_new = 1;
-			patch->old_name = NULL;
-		}
-		if (patch->is_delete < 0 && !newlines) {
-			patch->is_delete = 1;
-			patch->new_name = NULL;
-		}
-	}
 
 	if (0 < patch->is_new && oldlines)
 		die("new file %s depends on old contents", patch->new_name);

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] builtin-apply: check for empty files when detecting creation patch
  2008-05-14  0:13             ` Junio C Hamano
@ 2008-05-14  1:14               ` Linus Torvalds
  2008-05-17  9:12                 ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: Linus Torvalds @ 2008-05-14  1:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Imre Deak, git



On Tue, 13 May 2008, Junio C Hamano wrote:
> 
> With the change to remove the whole block, all tests still passes, and a
> limited test with this:
> 
>         --- empty	2008-05-13 16:56:57.000000000 -0700
>         +++ empty.1	2008-05-13 16:57:07.000000000 -0700
>         @@ -0,0 +1 @@
>         +foo
> 
> to update an originally empty file "empty" also seems to work.
> 
> However, with this change, it no longer allows you to accept such a patch
> and treat it as a creation of "empty".  Instead we barf with "error:
> empty: No such file or directory", if you do not have an empty "empty"
> file in the work tree when you run "git apply" on the above patch.

Ok, that's a bug. It should *not* require that existing empty file, since 
"is_new" is -1. That's what -1 means: we don't know if it is new or not.

So I think your patch is correct, but we need to fix the thing that barfs 
to not barf if we don't know the status of "is_new"

		Linus

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] builtin-apply: check for empty files when detecting creation patch
  2008-05-14  1:14               ` Linus Torvalds
@ 2008-05-17  9:12                 ` Junio C Hamano
  2008-05-17  9:18                   ` [PATCH 1/2] builtin-apply: accept patch to an empty file Junio C Hamano
  2008-05-17  9:19                   ` [PATCH 2/2] builtin-apply: do not declare patch is creation when we do not know it Junio C Hamano
  0 siblings, 2 replies; 12+ messages in thread
From: Junio C Hamano @ 2008-05-17  9:12 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Imre Deak, git

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Tue, 13 May 2008, Junio C Hamano wrote:
>> 
>> With the change to remove the whole block, all tests still passes, and a
>> limited test with this:
>> 
>>         --- empty	2008-05-13 16:56:57.000000000 -0700
>>         +++ empty.1	2008-05-13 16:57:07.000000000 -0700
>>         @@ -0,0 +1 @@
>>         +foo
>> 
>> to update an originally empty file "empty" also seems to work.
>> 
>> However, with this change, it no longer allows you to accept such a patch
>> and treat it as a creation of "empty".  Instead we barf with "error:
>> empty: No such file or directory", if you do not have an empty "empty"
>> file in the work tree when you run "git apply" on the above patch.
>
> Ok, that's a bug. It should *not* require that existing empty file, since 
> "is_new" is -1. That's what -1 means: we don't know if it is new or not.
>
> So I think your patch is correct, but we need to fix the thing that barfs 
> to not barf if we don't know the status of "is_new"

Sorry for taking some time to follow this through (I've been busy with day
job).  Two patches follow this message to address this issue.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/2] builtin-apply: accept patch to an empty file
  2008-05-17  9:12                 ` Junio C Hamano
@ 2008-05-17  9:18                   ` Junio C Hamano
  2008-05-17  9:19                   ` [PATCH 2/2] builtin-apply: do not declare patch is creation when we do not know it Junio C Hamano
  1 sibling, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2008-05-17  9:18 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Imre Deak, git

A patch from a foreign SCM (or plain "diff" output) often have both
preimage and postimage filename on ---/+++ lines even for a patch that
creates a new file.  However, when there is a filename for preimage, we
used to insist the file to exist (either in the work tree and/or in the
index).  When we cannot be sure by parsing the patch that it is not a
creation patch, we shouldn't complain when if there is no such a file.
This commit fixes the logic.

Refactor the code that validates the preimage file into a separate
function while we are at it, as it is getting rather big.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin-apply.c |  133 ++++++++++++++++++++++++++++++++-----------------------
 1 files changed, 77 insertions(+), 56 deletions(-)

diff --git a/builtin-apply.c b/builtin-apply.c
index 776e596..10b1f88 100644
--- a/builtin-apply.c
+++ b/builtin-apply.c
@@ -2267,16 +2267,11 @@ static int verify_index_match(struct cache_entry *ce, struct stat *st)
 	return ce_match_stat(ce, st, CE_MATCH_IGNORE_VALID);
 }
 
-static int check_patch(struct patch *patch, struct patch *prev_patch)
+static int check_preimage(struct patch *patch, struct cache_entry **ce, struct stat *st)
 {
-	struct stat st;
 	const char *old_name = patch->old_name;
-	const char *new_name = patch->new_name;
-	const char *name = old_name ? old_name : new_name;
-	struct cache_entry *ce = NULL;
-	int ok_if_exists;
-
-	patch->rejected = 1; /* we will drop this after we succeed */
+	int stat_ret = 0;
+	unsigned st_mode = 0;
 
 	/*
 	 * Make sure that we do not have local modifications from the
@@ -2284,58 +2279,84 @@ static int check_patch(struct patch *patch, struct patch *prev_patch)
 	 * we have the preimage file to be patched in the work tree,
 	 * unless --cached, which tells git to apply only in the index.
 	 */
-	if (old_name) {
-		int stat_ret = 0;
-		unsigned st_mode = 0;
-
-		if (!cached)
-			stat_ret = lstat(old_name, &st);
-		if (check_index) {
-			int pos = cache_name_pos(old_name, strlen(old_name));
-			if (pos < 0)
-				return error("%s: does not exist in index",
-					     old_name);
-			ce = active_cache[pos];
-			if (stat_ret < 0) {
-				struct checkout costate;
-				if (errno != ENOENT)
-					return error("%s: %s", old_name,
-						     strerror(errno));
-				/* checkout */
-				costate.base_dir = "";
-				costate.base_dir_len = 0;
-				costate.force = 0;
-				costate.quiet = 0;
-				costate.not_new = 0;
-				costate.refresh_cache = 1;
-				if (checkout_entry(ce,
-						   &costate,
-						   NULL) ||
-				    lstat(old_name, &st))
-					return -1;
-			}
-			if (!cached && verify_index_match(ce, &st))
-				return error("%s: does not match index",
-					     old_name);
-			if (cached)
-				st_mode = ce->ce_mode;
-		} else if (stat_ret < 0)
-			return error("%s: %s", old_name, strerror(errno));
-
-		if (!cached)
-			st_mode = ce_mode_from_stat(ce, st.st_mode);
+	if (!old_name)
+		return 0;
 
+	assert(patch->is_new <= 0);
+	if (!cached) {
+		stat_ret = lstat(old_name, st);
+		if (stat_ret && errno != ENOENT)
+			return error("%s: %s", old_name, strerror(errno));
+	}
+	if (check_index) {
+		int pos = cache_name_pos(old_name, strlen(old_name));
+		if (pos < 0) {
+			if (patch->is_new < 0)
+				goto is_new;
+			return error("%s: does not exist in index", old_name);
+		}
+		*ce = active_cache[pos];
+		if (stat_ret < 0) {
+			struct checkout costate;
+			/* checkout */
+			costate.base_dir = "";
+			costate.base_dir_len = 0;
+			costate.force = 0;
+			costate.quiet = 0;
+			costate.not_new = 0;
+			costate.refresh_cache = 1;
+			if (checkout_entry(*ce, &costate, NULL) ||
+			    lstat(old_name, st))
+				return -1;
+		}
+		if (!cached && verify_index_match(*ce, st))
+			return error("%s: does not match index", old_name);
+		if (cached)
+			st_mode = (*ce)->ce_mode;
+	} else if (stat_ret < 0) {
 		if (patch->is_new < 0)
-			patch->is_new = 0;
-		if (!patch->old_mode)
-			patch->old_mode = st_mode;
-		if ((st_mode ^ patch->old_mode) & S_IFMT)
-			return error("%s: wrong type", old_name);
-		if (st_mode != patch->old_mode)
-			fprintf(stderr, "warning: %s has type %o, expected %o\n",
-				old_name, st_mode, patch->old_mode);
+			goto is_new;
+		return error("%s: %s", old_name, strerror(errno));
 	}
 
+	if (!cached)
+		st_mode = ce_mode_from_stat(*ce, st->st_mode);
+
+	if (patch->is_new < 0)
+		patch->is_new = 0;
+	if (!patch->old_mode)
+		patch->old_mode = st_mode;
+	if ((st_mode ^ patch->old_mode) & S_IFMT)
+		return error("%s: wrong type", old_name);
+	if (st_mode != patch->old_mode)
+		fprintf(stderr, "warning: %s has type %o, expected %o\n",
+			old_name, st_mode, patch->old_mode);
+	return 0;
+
+ is_new:
+	patch->is_new = 1;
+	patch->is_delete = 0;
+	patch->old_name = NULL;
+	return 0;
+}
+
+static int check_patch(struct patch *patch, struct patch *prev_patch)
+{
+	struct stat st;
+	const char *old_name = patch->old_name;
+	const char *new_name = patch->new_name;
+	const char *name = old_name ? old_name : new_name;
+	struct cache_entry *ce = NULL;
+	int ok_if_exists;
+	int status;
+
+	patch->rejected = 1; /* we will drop this after we succeed */
+
+	status = check_preimage(patch, &ce, &st);
+	if (status)
+		return status;
+	old_name = patch->old_name;
+
 	if (new_name && prev_patch && 0 < prev_patch->is_delete &&
 	    !strcmp(prev_patch->old_name, new_name))
 		/*
-- 
1.5.5.1.443.g123e3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/2] builtin-apply: do not declare patch is creation when we do not know it
  2008-05-17  9:12                 ` Junio C Hamano
  2008-05-17  9:18                   ` [PATCH 1/2] builtin-apply: accept patch to an empty file Junio C Hamano
@ 2008-05-17  9:19                   ` Junio C Hamano
  1 sibling, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2008-05-17  9:19 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Imre Deak, git

When we see no context nor deleted line in the patch, we used to declare
that the patch creates a new file.  But some people create an empty file
and then apply a patch to it.  Similarly, a patch that delete everything
is not a deletion patch either.

This commit corrects these two issues.  Together with the previous commit,
it allows a diff between an empty file and a line-ful file to be treated
as both creation patch and "add stuff to an existing empty file",
depending on the context.  A new test t4126 demonstrates the fix.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin-apply.c        |   15 ------------
 t/t4126-apply-empty.sh |   60 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 60 insertions(+), 15 deletions(-)
 create mode 100755 t/t4126-apply-empty.sh

diff --git a/builtin-apply.c b/builtin-apply.c
index 10b1f88..1540f28 100644
--- a/builtin-apply.c
+++ b/builtin-apply.c
@@ -1143,21 +1143,6 @@ static int parse_single_patch(char *line, unsigned long size, struct patch *patc
 	if (patch->is_delete < 0 &&
 	    (newlines || (patch->fragments && patch->fragments->next)))
 		patch->is_delete = 0;
-	if (!unidiff_zero || context) {
-		/* If the user says the patch is not generated with
-		 * --unified=0, or if we have seen context lines,
-		 * then not having oldlines means the patch is creation,
-		 * and not having newlines means the patch is deletion.
-		 */
-		if (patch->is_new < 0 && !oldlines) {
-			patch->is_new = 1;
-			patch->old_name = NULL;
-		}
-		if (patch->is_delete < 0 && !newlines) {
-			patch->is_delete = 1;
-			patch->new_name = NULL;
-		}
-	}
 
 	if (0 < patch->is_new && oldlines)
 		die("new file %s depends on old contents", patch->new_name);
diff --git a/t/t4126-apply-empty.sh b/t/t4126-apply-empty.sh
new file mode 100755
index 0000000..6641571
--- /dev/null
+++ b/t/t4126-apply-empty.sh
@@ -0,0 +1,60 @@
+#!/bin/sh
+
+test_description='apply empty'
+
+. ./test-lib.sh
+
+test_expect_success setup '
+	>empty &&
+	git add empty &&
+	test_tick &&
+	git commit -m initial &&
+	for i in a b c d e
+	do
+		echo $i
+	done >empty &&
+	cat empty >expect &&
+	git diff |
+	sed -e "/^diff --git/d" \
+	    -e "/^index /d" \
+	    -e "s|a/empty|empty.orig|" \
+	    -e "s|b/empty|empty|" >patch0 &&
+	sed -e "s|empty|missing|" patch0 >patch1 &&
+	>empty &&
+	git update-index --refresh
+'
+
+test_expect_success 'apply empty' '
+	git reset --hard &&
+	>empty &&
+	rm -f missing &&
+	git apply patch0 &&
+	test_cmp expect empty
+'
+
+test_expect_success 'apply --index empty' '
+	git reset --hard &&
+	>empty &&
+	rm -f missing &&
+	git apply --index patch0 &&
+	test_cmp expect empty &&
+	git diff --exit-code
+'
+
+test_expect_success 'apply create' '
+	git reset --hard &&
+	>empty &&
+	rm -f missing &&
+	git apply patch1 &&
+	test_cmp expect missing
+'
+
+test_expect_success 'apply --index create' '
+	git reset --hard &&
+	>empty &&
+	rm -f missing &&
+	git apply --index patch1 &&
+	test_cmp expect missing &&
+	git diff --exit-code
+'
+
-- 
1.5.5.1.443.g123e3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2008-05-17  9:20 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-05-08 14:39 [PATCH] builtin-apply: check for empty files when detecting creation patch Imre Deak
2008-05-11  2:36 ` Junio C Hamano
2008-05-13 20:16   ` Imre Deak
2008-05-13 21:48     ` Junio C Hamano
2008-05-13 22:24       ` Linus Torvalds
2008-05-13 22:34         ` Junio C Hamano
2008-05-13 22:58           ` Linus Torvalds
2008-05-14  0:13             ` Junio C Hamano
2008-05-14  1:14               ` Linus Torvalds
2008-05-17  9:12                 ` Junio C Hamano
2008-05-17  9:18                   ` [PATCH 1/2] builtin-apply: accept patch to an empty file Junio C Hamano
2008-05-17  9:19                   ` [PATCH 2/2] builtin-apply: do not declare patch is creation when we do not know it Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).