git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Expose subprojects as special files to "git diff" machinery
@ 2007-04-15 18:14 Linus Torvalds
  2007-04-15 19:01 ` Sam Ravnborg
  2007-04-15 20:16 ` Andy Parkins
  0 siblings, 2 replies; 12+ messages in thread
From: Linus Torvalds @ 2007-04-15 18:14 UTC (permalink / raw)
  To: Junio C Hamano, Git Mailing List


The same way we generate diffs on symlinks as the the diff of text of the 
symlink, we can generate subproject diffs (when not recursing into them!) 
as the diff of the text that describes the subproject.

Of course, since what descibes a subproject is just the SHA1, that's what 
we'll use. Add some pretty-printing to make it a bit more obvious what is 
going on, and we're done.

So with this, we can get both raw diffs and "textual" diffs of subproject 
changes:

 - git diff --raw:

	:160000 160000 2de597b5ad348b7db04bd10cdd38cd81cbc93ab5 0000000... M    sub-A


 - git diff:

	diff --git a/sub-A b/sub-A
	index 2de597b..e8f11a4 160000
	--- a/sub-A
	+++ b/sub-A
	@@ -1 +1 @@
	-Subproject commit 2de597b5ad348b7db04bd10cdd38cd81cbc93ab5
	+Subproject commit e8f11a45c5c6b9e2fec6d136d3fb5aff75393d42

NOTE! We'll also want to have the ability to recurse into the subproject 
and actually diff it recursively, but that will involve a new command line 
option (I'd suggest "--subproject" and "-S", but the latter is in use by 
pickaxe), and some very different code.

But regardless of ay future recursive behaviour, we need the non-recursive 
version too (and it should be the default, at least in the absense of 
config options, so that large superprojects don't default to something 
extremely expensive).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 diff.c |   19 +++++++++++++++++++
 1 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/diff.c b/diff.c
index e4efb65..137f5b4 100644
--- a/diff.c
+++ b/diff.c
@@ -1419,6 +1419,21 @@ static int populate_from_stdin(struct diff_filespec *s)
 	return 0;
 }
 
+static int diff_populate_gitlink(struct diff_filespec *s, int size_only)
+{
+	int len;
+	char *data = xmalloc(100);
+	len = snprintf(data, 100,
+		"Subproject commit %s\n", sha1_to_hex(s->sha1));
+	s->data = data;
+	s->size = len;
+	if (size_only) {
+		s->data = NULL;
+		free(data);
+	}
+	return 0;
+}
+
 /*
  * While doing rename detection and pickaxe operation, we may need to
  * grab the data for the blob (or file) for our own in-core comparison.
@@ -1437,6 +1452,10 @@ int diff_populate_filespec(struct diff_filespec *s, int size_only)
 
 	if (s->data)
 		return err;
+
+	if (S_ISDIRLNK(s->mode))
+		return diff_populate_gitlink(s, size_only);
+
 	if (!s->sha1_valid ||
 	    reuse_worktree_file(s->path, s->sha1, 0)) {
 		struct stat st;

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: Expose subprojects as special files to "git diff" machinery
  2007-04-15 18:14 Expose subprojects as special files to "git diff" machinery Linus Torvalds
@ 2007-04-15 19:01 ` Sam Ravnborg
  2007-04-15 19:05   ` Linus Torvalds
  2007-04-15 20:16 ` Andy Parkins
  1 sibling, 1 reply; 12+ messages in thread
From: Sam Ravnborg @ 2007-04-15 19:01 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List

>  
> +static int diff_populate_gitlink(struct diff_filespec *s, int size_only)
> +{
> +	int len;
> +	char *data = xmalloc(100);
> +	len = snprintf(data, 100,
> +		"Subproject commit %s\n", sha1_to_hex(s->sha1));

In userland I would use a local variable for an array of the size of 100.
I would normally only allocate when we are say 5x bigger.

I wonder if there is a specific reason why you decided upon xmalloc here?
I see no problem in using xmalloc but wonder if there is something I
should start to do differently??

	Sam

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Expose subprojects as special files to "git diff" machinery
  2007-04-15 19:01 ` Sam Ravnborg
@ 2007-04-15 19:05   ` Linus Torvalds
  2007-04-15 19:20     ` Sam Ravnborg
  0 siblings, 1 reply; 12+ messages in thread
From: Linus Torvalds @ 2007-04-15 19:05 UTC (permalink / raw)
  To: Sam Ravnborg; +Cc: Junio C Hamano, Git Mailing List



On Sun, 15 Apr 2007, Sam Ravnborg wrote:

> >  
> > +static int diff_populate_gitlink(struct diff_filespec *s, int size_only)
> > +{
> > +	int len;
> > +	char *data = xmalloc(100);
> > +	len = snprintf(data, 100,
> > +		"Subproject commit %s\n", sha1_to_hex(s->sha1));
> 
> In userland I would use a local variable for an array of the size of 100.
> I would normally only allocate when we are say 5x bigger.

We're _returning_ the pointer to the caller, so no, we cannot use an 
automatic array.

But I do think I had a bug - I think I should have set

	s->should_free = 1;

to let the caller know it should be free'd with free(), and not leak the 
thing.

		Linus

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Expose subprojects as special files to "git diff" machinery
  2007-04-15 19:05   ` Linus Torvalds
@ 2007-04-15 19:20     ` Sam Ravnborg
  0 siblings, 0 replies; 12+ messages in thread
From: Sam Ravnborg @ 2007-04-15 19:20 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List

On Sun, Apr 15, 2007 at 12:05:44PM -0700, Linus Torvalds wrote:
> 
> 
> On Sun, 15 Apr 2007, Sam Ravnborg wrote:
> 
> > >  
> > > +static int diff_populate_gitlink(struct diff_filespec *s, int size_only)
> > > +{
> > > +	int len;
> > > +	char *data = xmalloc(100);
> > > +	len = snprintf(data, 100,
> > > +		"Subproject commit %s\n", sha1_to_hex(s->sha1));
> > 
> > In userland I would use a local variable for an array of the size of 100.
> > I would normally only allocate when we are say 5x bigger.
> 
> We're _returning_ the pointer to the caller, so no, we cannot use an 
> automatic array.
Do not know how I missed that...
Thanks,
	Sam

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Expose subprojects as special files to "git diff" machinery
  2007-04-15 18:14 Expose subprojects as special files to "git diff" machinery Linus Torvalds
  2007-04-15 19:01 ` Sam Ravnborg
@ 2007-04-15 20:16 ` Andy Parkins
  2007-04-15 21:03   ` Linus Torvalds
  1 sibling, 1 reply; 12+ messages in thread
From: Andy Parkins @ 2007-04-15 20:16 UTC (permalink / raw)
  To: git; +Cc: Linus Torvalds, Junio C Hamano

On Sunday 2007, April 15, Linus Torvalds wrote:

> 	diff --git a/sub-A b/sub-A
> 	index 2de597b..e8f11a4 160000
> 	--- a/sub-A
> 	+++ b/sub-A
> 	@@ -1 +1 @@
> 	-Subproject commit 2de597b5ad348b7db04bd10cdd38cd81cbc93ab5
> 	+Subproject commit e8f11a45c5c6b9e2fec6d136d3fb5aff75393d42

Isn't this dangerous because it looks just like a normal diff with a 
file being rewritten, when in truth it is a tree entry record being 
rewritten.  Possibly this is in the same category as the enhancements 
needed for rename support

cf.

diff --git a/file1 b/file2
similarity index 100%
rename from file1
rename to file2

There's no need to have a line marker, as there is only one "line" in 
the subproject "file", and no need for the index line because that 
information is contained in the change itself.  So something like:

diff --git a/sub-A b/sub-B
subproject from commit 2de597b5ad348b7db04bd10cdd38cd81cbc93ab5
subproject to commit e8f11a45c5c6b9e2fec6d136d3fb5aff75393d42

Regardless of the format, I'm sure this should be treated as part of 
git's extended diff syntax, so that if a diff like this ever got into 
the wild it wouldn't be a disaster if someone tried to apply it.



Andy
-- 
Dr Andy Parkins, M Eng (hons), MIET
andyparkins@gmail.com

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Expose subprojects as special files to "git diff" machinery
  2007-04-15 20:16 ` Andy Parkins
@ 2007-04-15 21:03   ` Linus Torvalds
  2007-04-18  8:46     ` git-format-patch for binary files / merges [Re: Expose subprojects as special files to "git diff" machinery] Sam Vilain
  0 siblings, 1 reply; 12+ messages in thread
From: Linus Torvalds @ 2007-04-15 21:03 UTC (permalink / raw)
  To: Andy Parkins; +Cc: git, Junio C Hamano



On Sun, 15 Apr 2007, Andy Parkins wrote:
> > 	+++ b/sub-A
> > 	@@ -1 +1 @@
> > 	-Subproject commit 2de597b5ad348b7db04bd10cdd38cd81cbc93ab5
> > 	+Subproject commit e8f11a45c5c6b9e2fec6d136d3fb5aff75393d42
> 
> Isn't this dangerous because it looks just like a normal diff with a 
> file being rewritten, when in truth it is a tree entry record being 
> rewritten.

Well, that's exactly what symlinks do too.

You have to look at the mode to know what the rewriting *means*.

But yeah, I wouldn't object at all to making it an "extended git header" 
instead (possibly just for subprojects, possibly for symlinks too)

		Linus

^ permalink raw reply	[flat|nested] 12+ messages in thread

* git-format-patch for binary files / merges [Re: Expose subprojects as special files to "git diff" machinery]
  2007-04-15 21:03   ` Linus Torvalds
@ 2007-04-18  8:46     ` Sam Vilain
  2007-04-18 11:49       ` Alex Riesen
                         ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Sam Vilain @ 2007-04-18  8:46 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andy Parkins, git, Junio C Hamano

Linus Torvalds wrote:
>> Isn't this dangerous because it looks just like a normal diff with a 
>> file being rewritten, when in truth it is a tree entry record being 
>> rewritten.
>>     
> Well, that's exactly what symlinks do too.
>
> You have to look at the mode to know what the rewriting *means*.
>
> But yeah, I wouldn't object at all to making it an "extended git header" 
> instead (possibly just for subprojects, possibly for symlinks too)
>   

Speaking of 'custom' patch file formats, anyone put any thought to a
format for the commits which can't be represented with patch, like
binary files and merges?

Sam.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: git-format-patch for binary files / merges [Re: Expose subprojects as special files to "git diff" machinery]
  2007-04-18  8:46     ` git-format-patch for binary files / merges [Re: Expose subprojects as special files to "git diff" machinery] Sam Vilain
@ 2007-04-18 11:49       ` Alex Riesen
  2007-04-18 11:52       ` Johannes Schindelin
  2007-04-18 15:47       ` Linus Torvalds
  2 siblings, 0 replies; 12+ messages in thread
From: Alex Riesen @ 2007-04-18 11:49 UTC (permalink / raw)
  To: Sam Vilain; +Cc: Linus Torvalds, Andy Parkins, git, Junio C Hamano

On 4/18/07, Sam Vilain <sam@vilain.net> wrote:
> Linus Torvalds wrote:
> >> Isn't this dangerous because it looks just like a normal diff with a
> >> file being rewritten, when in truth it is a tree entry record being
> >> rewritten.
> >>
> > Well, that's exactly what symlinks do too.
> >
> > You have to look at the mode to know what the rewriting *means*.
> >
> > But yeah, I wouldn't object at all to making it an "extended git header"
> > instead (possibly just for subprojects, possibly for symlinks too)
> >
>
> Speaking of 'custom' patch file formats, anyone put any thought to a
> format for the commits which can't be represented with patch, like
> binary files and merges?
>

There is binary diff (look for --binary in Documentation/).
And regarding merges: have you looked at git-bundle yet?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: git-format-patch for binary files / merges [Re: Expose subprojects as special files to "git diff" machinery]
  2007-04-18  8:46     ` git-format-patch for binary files / merges [Re: Expose subprojects as special files to "git diff" machinery] Sam Vilain
  2007-04-18 11:49       ` Alex Riesen
@ 2007-04-18 11:52       ` Johannes Schindelin
  2007-04-18 15:47       ` Linus Torvalds
  2 siblings, 0 replies; 12+ messages in thread
From: Johannes Schindelin @ 2007-04-18 11:52 UTC (permalink / raw)
  To: Sam Vilain; +Cc: Linus Torvalds, Andy Parkins, git, Junio C Hamano

Hi,

On Wed, 18 Apr 2007, Sam Vilain wrote:

> Speaking of 'custom' patch file formats, anyone put any thought to a 
> format for the commits which can't be represented with patch, like 
> binary files and merges?

Git has support for binary patches. They are base85 encoded, but you have 
to enable them explicitely (like renames) with --binary.

As for merges, there is the combined diff format. It has more than one 
cell at the beginning to indicate more than one "new" file (we do not use 
this format for merges, but rather for conflicts, but you could easily 
invert that meaning).

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: git-format-patch for binary files / merges [Re: Expose subprojects as special files to "git diff" machinery]
  2007-04-18  8:46     ` git-format-patch for binary files / merges [Re: Expose subprojects as special files to "git diff" machinery] Sam Vilain
  2007-04-18 11:49       ` Alex Riesen
  2007-04-18 11:52       ` Johannes Schindelin
@ 2007-04-18 15:47       ` Linus Torvalds
  2007-04-18 16:08         ` Junio C Hamano
  2007-04-19  8:28         ` Johannes Schindelin
  2 siblings, 2 replies; 12+ messages in thread
From: Linus Torvalds @ 2007-04-18 15:47 UTC (permalink / raw)
  To: Sam Vilain; +Cc: Andy Parkins, git, Junio C Hamano



On Wed, 18 Apr 2007, Sam Vilain wrote:
> 
> Speaking of 'custom' patch file formats, anyone put any thought to a
> format for the commits which can't be represented with patch, like
> binary files and merges?

We do already support binary patches, so those can be represented well in 
a patch (but you need "git-apply" to apply them, so they are disabled by 
default.. And while we have tests for them, I suspect not a lot of people 
really use them widely, so who knows how complete the coverage is. For 
example, will "git rebase" really work? Probably. Do I know for sure? No).

As to merges, they are certainly something that *can* be represented as a 
patch (just tell what all the parents were, and desribe the end result as 
a patch against _one_ of them: that already really encodes all the 
information). So we *could* support it with some trivial extension (and I 
don't think the extension is necessarily even an expansion of the diff 
format itself - I think the parents information is a "higher-level" 
information, and independent of the actual patch).

However, representing merges as a patch doesn't really make much sense. 
You really have two separate cases, and neither of them really makes sense 
to have a patch with:

 - you want to "rebase" the history, including merges. Quite frankly, I 
   don't think a patch makes sense: you'd really need to re-do the merge 
   with the old parents "remapped" to the new rebased parents, and what 
   you might want to have is the "rerere" information from the previous 
   merge. Maybe that's a patch, maybe it's not, but regardless, I don't
   think it really makes sense to describe it as a "patch" even if the 
   implementation would do that: you don't do "remote rebases", so 
   whatever it is would really be totally an internal thing to git-rebase.

   We currently simply don't support rebasing history with merges. Rebase 
   is a linear history operation. If somebody wants to try to rebase 
   complex history, I'd applaud the effort, but I think the problems are 
   elsewhere than the actual "patch" part. 

 - you want to send raw git commits to somebody else. Use the native git 
   protocol. We already support that over email, it's called "bundles".

   I used "bundles" once or twice under BK (it was called "bk send" and 
   "bk receive + bk resolve" or something like that), and I hated them. It 
   wasn't BK's fault: I just found the workflow annoying. So I haven't 
   even tested the git bundles, but if what you were looking for was to do 
   a "git push/pull" by email, they are what you'd be using.

(The reason I don't like bundles is that if I use email, I want to see the 
individual patches _in_ the email. And once I see them, I think it's 
better to take advantage of the flexibility that individual patches gives 
me, ie I can re-order them, decide to skip one, edit the messages or 
whatever before applying them. If I were to use bundles, I might as well 
just not blow up my mailbox, and just ask people to tell me where to 
pull from).

			Linus

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: git-format-patch for binary files / merges [Re: Expose subprojects as special files to "git diff" machinery]
  2007-04-18 15:47       ` Linus Torvalds
@ 2007-04-18 16:08         ` Junio C Hamano
  2007-04-19  8:28         ` Johannes Schindelin
  1 sibling, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2007-04-18 16:08 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Sam Vilain, Andy Parkins, git

Linus Torvalds <torvalds@linux-foundation.org> writes:

> We do already support binary patches, so those can be represented well in 
> a patch (but you need "git-apply" to apply them, so they are disabled by 
> default.. And while we have tests for them, I suspect not a lot of people 
> really use them widely, so who knows how complete the coverage is. For 
> example, will "git rebase" really work? Probably. Do I know for sure? No).

I've seen it in action for at least a few times in my day-job,
so it appears to work ;-)

>    I used "bundles" once or twice under BK (it was called "bk send" and 
>    "bk receive + bk resolve" or something like that), and I hated them. It 
>    wasn't BK's fault: I just found the workflow annoying. So I haven't 
>    even tested the git bundles, but if what you were looking for was to do 
>    a "git push/pull" by email, they are what you'd be using.

I've send a bundle over e-mail exactly once; it appears to work,
too.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: git-format-patch for binary files / merges [Re: Expose subprojects as special files to "git diff" machinery]
  2007-04-18 15:47       ` Linus Torvalds
  2007-04-18 16:08         ` Junio C Hamano
@ 2007-04-19  8:28         ` Johannes Schindelin
  1 sibling, 0 replies; 12+ messages in thread
From: Johannes Schindelin @ 2007-04-19  8:28 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Sam Vilain, Andy Parkins, git, Junio C Hamano

Hi,

On Wed, 18 Apr 2007, Linus Torvalds wrote:

> As to merges, they are certainly something that *can* be represented as a 
> patch [...]

> However, representing merges as a patch doesn't really make much sense.

Not if you want to apply it. But if you want to review it. That's why we 
have the combined diff format, which can be generated by diff, but not 
applied.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2007-04-19  8:28 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-04-15 18:14 Expose subprojects as special files to "git diff" machinery Linus Torvalds
2007-04-15 19:01 ` Sam Ravnborg
2007-04-15 19:05   ` Linus Torvalds
2007-04-15 19:20     ` Sam Ravnborg
2007-04-15 20:16 ` Andy Parkins
2007-04-15 21:03   ` Linus Torvalds
2007-04-18  8:46     ` git-format-patch for binary files / merges [Re: Expose subprojects as special files to "git diff" machinery] Sam Vilain
2007-04-18 11:49       ` Alex Riesen
2007-04-18 11:52       ` Johannes Schindelin
2007-04-18 15:47       ` Linus Torvalds
2007-04-18 16:08         ` Junio C Hamano
2007-04-19  8:28         ` Johannes Schindelin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).