* Git help for kernel archeology, suppress diffs caused by CVS keyword expansion
@ 2007-07-22 18:48 Jon Smirl
2007-07-22 19:00 ` Johannes Schindelin
` (2 more replies)
0 siblings, 3 replies; 24+ messages in thread
From: Jon Smirl @ 2007-07-22 18:48 UTC (permalink / raw)
To: Git Mailing List, lkml
It would really be useful if git diff had an option for suppressing
diffs caused by CVS keyword expansion. I run into this problem over
and over when trying to recover stuff out of old kernel sources that
people checked into CVS and then posted CVS diffs to fulfill their GPL
obligations. I sometimes wonder if vendors are doing this on purpose
to make it more difficult to recover the changes they made to the
code.
To prevent this in the future, I'd love to see a patch removing all
CVS keywords from the kernel sources. Quick grep of kernel shows 1,535
$Id, 197 $Log, 441 $Revision, 144 $Date. I guestimate that this would
remove about 5,000 lines form the kernel source and touch 1,700 files.
Would this be accept or do those expansions contain useful info?
Example diff caused by $Log expansion:
diff --git a/arch/cris/kernel/ptrace.c b/arch/cris/kernel/ptrace.c
index b3d10fb..32e47ca 100644
--- a/arch/cris/kernel/ptrace.c
+++ b/arch/cris/kernel/ptrace.c
@@ -8,6 +8,9 @@
* Authors: Bjorn Wesen
*
* $Log: ptrace.c,v $
+ * Revision 1.2 2003/06/26 21:08:16 vangool
+ * Commit kernel changes.
+ *
* Revision 1.8 2001/11/12 18:26:21 pkj
* Fixed compiler warnings.
*
Or in this case the person edited out all of the lines in the kernel
containing $Id
diff --git a/arch/i386/boot/tools/build.c b/arch/i386/boot/tools/build.c
index 2edd0a4..792aeb1 100644
--- a/arch/i386/boot/tools/build.c
+++ b/arch/i386/boot/tools/build.c
@@ -1,5 +1,4 @@
/*
- * $Id: build.c,v 1.5 1997/05/19 12:29:58 mj Exp $
*
* Copyright (C) 1991, 1992 Linus Torvalds
* Copyright (C) 1997 Martin Mares
A variation on this where the Id was updated..
- * $Id: build.c,v 1.5 1997/05/19 12:29:58 mj Exp $
+ * $Id: build.c,v 1.5 2002/09/29 12:29:58 mj Exp $
There are a more CVS keywords that can cause expansion, probably around twenty.
When removing these with a simple grep it can get fooled and mess up.
CVS keyword expansion adds about 60,000 lines of noise to a typical
diff.
--
Jon Smirl
jonsmirl@gmail.com
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion
2007-07-22 18:48 Git help for kernel archeology, suppress diffs caused by CVS keyword expansion Jon Smirl
@ 2007-07-22 19:00 ` Johannes Schindelin
2007-07-22 19:10 ` Jon Smirl
2007-07-22 19:09 ` Linus Torvalds
2007-07-22 19:47 ` Jan Engelhardt
2 siblings, 1 reply; 24+ messages in thread
From: Johannes Schindelin @ 2007-07-22 19:00 UTC (permalink / raw)
To: Jon Smirl; +Cc: Git Mailing List
Hi,
[culling lkml list, since this is really a Git problem]
On Sun, 22 Jul 2007, Jon Smirl wrote:
> It would really be useful if git diff had an option for suppressing
> diffs caused by CVS keyword expansion.
Looks to me slightly a bit like an XY problem.
How about using git-filter-branch to get rid of the expansions? Or even
better, if you have access to the CVS server, use the -k option to
cvsimport?
Hth,
Dscho
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion
2007-07-22 18:48 Git help for kernel archeology, suppress diffs caused by CVS keyword expansion Jon Smirl
2007-07-22 19:00 ` Johannes Schindelin
@ 2007-07-22 19:09 ` Linus Torvalds
2007-07-22 19:12 ` Jon Smirl
2007-07-22 19:17 ` David Kastrup
2007-07-22 19:47 ` Jan Engelhardt
2 siblings, 2 replies; 24+ messages in thread
From: Linus Torvalds @ 2007-07-22 19:09 UTC (permalink / raw)
To: Jon Smirl; +Cc: Git Mailing List, lkml
On Sun, 22 Jul 2007, Jon Smirl wrote:
>
> It would really be useful if git diff had an option for suppressing
> diffs caused by CVS keyword expansion.
I really think it's not a "git diff" issue, but it might be a "import"
issue.
IOW, I think you'd be a *lot* better off just not importing those things
in the first place (which is what CVS does internally), or possibly
importing them as two trees (ie you'd have the "non-log" version and the
"log expansion" version, so that you can track and compare both).
Doing the thing at "diff" time is certainly possible, but this is simply
much better done as a totally independent preprocessing phase. The diff
handling is already some of the more complex parts (and very central), it
would be much simpler and efficient to not try to make that thing fancier,
and instead solve the problem at the front-end.
Linus
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion
2007-07-22 19:00 ` Johannes Schindelin
@ 2007-07-22 19:10 ` Jon Smirl
2007-07-22 19:15 ` Johannes Schindelin
0 siblings, 1 reply; 24+ messages in thread
From: Jon Smirl @ 2007-07-22 19:10 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Git Mailing List
On 7/22/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> Hi,
>
> [culling lkml list, since this is really a Git problem]
>
> On Sun, 22 Jul 2007, Jon Smirl wrote:
>
> > It would really be useful if git diff had an option for suppressing
> > diffs caused by CVS keyword expansion.
>
> Looks to me slightly a bit like an XY problem.
>
> How about using git-filter-branch to get rid of the expansions? Or even
> better, if you have access to the CVS server, use the -k option to
> cvsimport?
No access to the server the people handing over the diffs and not
making things easy. You get diffs like this when asking companies to
turn over their changes for GPL compliance purposes. A big source is
from companies building embedded systems., Usually the diffs are
pretty worthless but every once in a while they contain very useful
nuggets (like inadvertently documenting secret hardware features). It
is hard to spot the nuggets in all of the noise.
>
> Hth,
> Dscho
>
>
--
Jon Smirl
jonsmirl@gmail.com
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion
2007-07-22 19:09 ` Linus Torvalds
@ 2007-07-22 19:12 ` Jon Smirl
2007-07-22 19:37 ` Linus Torvalds
2007-07-22 19:17 ` David Kastrup
1 sibling, 1 reply; 24+ messages in thread
From: Jon Smirl @ 2007-07-22 19:12 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Git Mailing List, lkml
On 7/22/07, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
>
> On Sun, 22 Jul 2007, Jon Smirl wrote:
> >
> > It would really be useful if git diff had an option for suppressing
> > diffs caused by CVS keyword expansion.
>
> I really think it's not a "git diff" issue, but it might be a "import"
> issue.
>
> IOW, I think you'd be a *lot* better off just not importing those things
> in the first place (which is what CVS does internally), or possibly
> importing them as two trees (ie you'd have the "non-log" version and the
> "log expansion" version, so that you can track and compare both).
>
> Doing the thing at "diff" time is certainly possible, but this is simply
> much better done as a totally independent preprocessing phase. The diff
> handling is already some of the more complex parts (and very central), it
> would be much simpler and efficient to not try to make that thing fancier,
> and instead solve the problem at the front-end.
These diffs are coming from companies doing GPL compliance without
really wanting to comply. CVS servers are not made available.
>
> Linus
>
--
Jon Smirl
jonsmirl@gmail.com
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion
2007-07-22 19:10 ` Jon Smirl
@ 2007-07-22 19:15 ` Johannes Schindelin
2007-07-22 19:48 ` Jon Smirl
0 siblings, 1 reply; 24+ messages in thread
From: Johannes Schindelin @ 2007-07-22 19:15 UTC (permalink / raw)
To: Jon Smirl; +Cc: Git Mailing List
Hi,
On Sun, 22 Jul 2007, Jon Smirl wrote:
> On 7/22/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
>
> > On Sun, 22 Jul 2007, Jon Smirl wrote:
> >
> > > It would really be useful if git diff had an option for suppressing
> > > diffs caused by CVS keyword expansion.
> >
> > Looks to me slightly a bit like an XY problem.
> >
> > How about using git-filter-branch to get rid of the expansions? Or
> > even better, if you have access to the CVS server, use the -k option
> > to cvsimport?
>
> No access to the server the people handing over the diffs and not
> making things easy.
Ah yes, that's what I feared. Darn.
But still, I think that it would be much better not to put this into Git.
We do have diff gitattributes now, so that you can roll your own diff for
specific files, but I still think that this is more a task for a
standalone perl script. Possibly being called from filter-branch to be
done with the conversion once and for all times.
Ciao,
Dscho
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion
2007-07-22 19:09 ` Linus Torvalds
2007-07-22 19:12 ` Jon Smirl
@ 2007-07-22 19:17 ` David Kastrup
1 sibling, 0 replies; 24+ messages in thread
From: David Kastrup @ 2007-07-22 19:17 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Jon Smirl, Git Mailing List, lkml
Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Sun, 22 Jul 2007, Jon Smirl wrote:
>>
>> It would really be useful if git diff had an option for suppressing
>> diffs caused by CVS keyword expansion.
>
> I really think it's not a "git diff" issue, but it might be a "import"
> issue.
>
> IOW, I think you'd be a *lot* better off just not importing those
> things in the first place (which is what CVS does internally), or
> possibly importing them as two trees (ie you'd have the "non-log"
> version and the "log expansion" version, so that you can track and
> compare both).
One problem is that those strings more often than not are involved in
some magic computation, like placing version and date information into
extracted output. While an import of the unexpanded version is really
the sanest option, it might render the resulting code inoperable.
CVS diff itself reports those differences, too (and it makes for a
good quota of merge problems even in CVS), so it is not like we are in
bad company here.
Uh, strike that.
--
David Kastrup, Kriemhildstr. 15, 44793 Bochum
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion
2007-07-22 19:12 ` Jon Smirl
@ 2007-07-22 19:37 ` Linus Torvalds
0 siblings, 0 replies; 24+ messages in thread
From: Linus Torvalds @ 2007-07-22 19:37 UTC (permalink / raw)
To: Jon Smirl; +Cc: Git Mailing List, lkml
On Sun, 22 Jul 2007, Jon Smirl wrote:
> On 7/22/07, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> >
> >
> > On Sun, 22 Jul 2007, Jon Smirl wrote:
> > >
> > > It would really be useful if git diff had an option for suppressing
> > > diffs caused by CVS keyword expansion.
> >
> > I really think it's not a "git diff" issue, but it might be a "import"
> > issue.
> >
> > IOW, I think you'd be a *lot* better off just not importing those things
> > in the first place (which is what CVS does internally), or possibly
> > importing them as two trees (ie you'd have the "non-log" version and the
> > "log expansion" version, so that you can track and compare both).
> >
> > Doing the thing at "diff" time is certainly possible, but this is simply
> > much better done as a totally independent preprocessing phase. The diff
> > handling is already some of the more complex parts (and very central), it
> > would be much simpler and efficient to not try to make that thing fancier,
> > and instead solve the problem at the front-end.
>
> These diffs are coming from companies doing GPL compliance without
> really wanting to comply. CVS servers are not made available.
That wasn't what I said.
You want to supporess the CVS keyword expansion.
I'm telling you that you should just do so.
But I'm *also* telling you that this has nothing to do with "git diff".
The way to get "git diff" to not show the CVS expansion is to *remove* the
CVS expansion. By simply running some pre-processing on the patches
*before* you put them into git in the first place (or, like Dscho
suggested: you could do it later too, by using git filter-branch).
Once you have the version that doesn't have the CVS log, "git diff" will
automatically do the rigth thing.
In other words, I'm just saying that you're trying to solve the wrong
problem. Once you solve the *right* problem, the wrong problem just goes
away.
Linus
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion
2007-07-22 18:48 Git help for kernel archeology, suppress diffs caused by CVS keyword expansion Jon Smirl
2007-07-22 19:00 ` Johannes Schindelin
2007-07-22 19:09 ` Linus Torvalds
@ 2007-07-22 19:47 ` Jan Engelhardt
2 siblings, 0 replies; 24+ messages in thread
From: Jan Engelhardt @ 2007-07-22 19:47 UTC (permalink / raw)
To: Jon Smirl; +Cc: Git Mailing List, lkml
On Jul 22 2007 14:48, Jon Smirl wrote:
>
> It would really be useful if git diff had an option for suppressing
> diffs caused by CVS keyword expansion. I run into this problem over
> and over when trying to recover stuff out of old kernel sources that
> people checked into CVS and then posted CVS diffs to fulfill their GPL
> obligations. I sometimes wonder if vendors are doing this on purpose
> to make it more difficult to recover the changes they made to the
> code.
>
> To prevent this in the future, I'd love to see a patch removing all
> CVS keywords from the kernel sources.
I say: yes please. Even svn (which Linus dislikes very much) has cvs
keywords disabled by default. And at the same time, a patch to
CodingStyle not to include CVS tags, and a patch to checkpatch.pl
to enforce it :)
> Quick grep of kernel shows 1,535 $Id, 197 $Log, 441 $Revision, 144 $Date. I
> guestimate that this would remove about 5,000 lines form the kernel source
> and touch 1,700 files. Would this be accept or do those expansions contain
> useful info?
Common argument for $Id$ tags has been, that upon a problem, a user
can look into dmesg (or whatever log is appropraite - it's not
limited to the kernel) for the version string instead of figuring it
out himself (by means or rpm -q, or whatever hides it).
For the Linux kernel however, I do not think it is important, since:
- the code of a module changes faster than you can think (e.g.
a change in the network code touched a network filesystem). Result:
same $Id$ tag in the vendor's version AND the mainline version.
So the tag is useless.
- you know what version you are running, mingetty usually prints it
for you; uname -a also shows it.
- if you have a problem, people kindly ask you to bisect anyway, in
which case $Id$ numbers become irrelevant anyway, because you got
the git SHA1 ids.
Yep, that's pretty much it from my side. Conclusion: Remove CVS
tags from kernel.
Jan
--
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion
2007-07-22 19:15 ` Johannes Schindelin
@ 2007-07-22 19:48 ` Jon Smirl
2007-07-22 21:39 ` Johannes Schindelin
0 siblings, 1 reply; 24+ messages in thread
From: Jon Smirl @ 2007-07-22 19:48 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Git Mailing List
On 7/22/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> But still, I think that it would be much better not to put this into Git.
> We do have diff gitattributes now, so that you can roll your own diff for
> specific files, but I still think that this is more a task for a
> standalone perl script. Possibly being called from filter-branch to be
> done with the conversion once and for all times.
I can provide sample diffs if you want something to play with.
--
Jon Smirl
jonsmirl@gmail.com
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion
2007-07-22 19:48 ` Jon Smirl
@ 2007-07-22 21:39 ` Johannes Schindelin
2007-07-22 23:45 ` Jon Smirl
0 siblings, 1 reply; 24+ messages in thread
From: Johannes Schindelin @ 2007-07-22 21:39 UTC (permalink / raw)
To: Jon Smirl; +Cc: Git Mailing List
Hi,
On Sun, 22 Jul 2007, Jon Smirl wrote:
> On 7/22/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> > But still, I think that it would be much better not to put this into Git.
> > We do have diff gitattributes now, so that you can roll your own diff for
> > specific files, but I still think that this is more a task for a
> > standalone perl script. Possibly being called from filter-branch to be
> > done with the conversion once and for all times.
>
> I can provide sample diffs if you want something to play with.
As you already did, this is my attempt at a perl script... Feel free to
bash my Perl capabilities, or to correct it...
Ciao,
Dscho
-- snipsnap --
#!/usr/bin/perl
sub init_hunk {
$_ = $_[0];
$current_hunk = "";
$current_hunk_header = $_;
($start_minus, $dummy, $start_plus, $dummy) =
/^\@\@ -(\d+)(,\d+|) \+(\d+)(,\d+|) \@\@/;
$plus = $minus = $space = 0;
$skip_logs = 0;
}
sub flush_hunk {
if ($plus > 0 || $minus > 0) {
if ($current_file ne "") {
print $current_file;
$current_file = "";
}
$minus += $space;
$plus += $space;
print "\@\@ -$start_minus,$minus "
. "+$start_plus,$plus \@\@\n";
print $current_hunk;
}
}
sub check_file {
$_ = $_[0];
$current_file = $_;
while (<>) {
if (/^\@\@/) {
last;
}
$current_file .= $_;
}
init_hunk $_;
# check hunks
while (<>) {
if ($skip_logs && /^\+ *\*/) {
# do nothing
} elsif (/^\@\@.*/) {
flush_hunk;
init_hunk $_;
} elsif (/^diff/) {
flush_hunk;
return;
} elsif (/^-.*\$(Id|Revision|Author|Date).*\$/) {
$key = $1;
s/^-/ /;
$current_hunk .= $_;
$space++;
$_ = <>;
if (!/\+.*\$Id.*\$/) {
die "Expected some changed \$$key line: $_";
}
$skip_logs = 0;
} elsif (/^ .*\$Log.*\$/) {
$current_hunk .= $_;
$space++;
$skip_logs++;
} elsif (/^ /) {
$current_hunk .= $_;
$space++;
$skip_logs = 0;
} elsif (/^\+/) {
$current_hunk .= $_;
$plus++;
} elsif (/^-/) {
$current_hunk .= $_;
$minus++;
$skip_logs = 0;
} else {
die "Unexpected line: $_";
}
}
}
while (<>) {
if (/^diff/) {
do {
check_file $_;
} while(/^diff/);
} else {
printf $_;
}
}
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion
2007-07-22 21:39 ` Johannes Schindelin
@ 2007-07-22 23:45 ` Jon Smirl
2007-07-22 23:50 ` Johannes Schindelin
0 siblings, 1 reply; 24+ messages in thread
From: Jon Smirl @ 2007-07-22 23:45 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Git Mailing List
On 7/22/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> As you already did, this is my attempt at a perl script... Feel free to
> bash my Perl capabilities, or to correct it...
I don't really know Perl but Perl is probably a better language for
this. I was doing it in C.
This doesn't run on the two full kernel samples I sent you. That's the
problem I was having, I can catch 95% of the expanded keywords with my
program but I have to touch up 5% by hand. I'll stare at this and see
if I can increase my understanding of Perl.
>
> Ciao,
> Dscho
>
> -- snipsnap --
> #!/usr/bin/perl
>
> sub init_hunk {
> $_ = $_[0];
> $current_hunk = "";
> $current_hunk_header = $_;
> ($start_minus, $dummy, $start_plus, $dummy) =
> /^\@\@ -(\d+)(,\d+|) \+(\d+)(,\d+|) \@\@/;
> $plus = $minus = $space = 0;
> $skip_logs = 0;
> }
>
> sub flush_hunk {
> if ($plus > 0 || $minus > 0) {
> if ($current_file ne "") {
> print $current_file;
> $current_file = "";
> }
> $minus += $space;
> $plus += $space;
> print "\@\@ -$start_minus,$minus "
> . "+$start_plus,$plus \@\@\n";
> print $current_hunk;
> }
> }
>
> sub check_file {
> $_ = $_[0];
> $current_file = $_;
> while (<>) {
> if (/^\@\@/) {
> last;
> }
> $current_file .= $_;
> }
>
> init_hunk $_;
>
> # check hunks
> while (<>) {
> if ($skip_logs && /^\+ *\*/) {
> # do nothing
> } elsif (/^\@\@.*/) {
> flush_hunk;
> init_hunk $_;
> } elsif (/^diff/) {
> flush_hunk;
> return;
> } elsif (/^-.*\$(Id|Revision|Author|Date).*\$/) {
> $key = $1;
> s/^-/ /;
> $current_hunk .= $_;
> $space++;
> $_ = <>;
> if (!/\+.*\$Id.*\$/) {
> die "Expected some changed \$$key line: $_";
> }
> $skip_logs = 0;
> } elsif (/^ .*\$Log.*\$/) {
> $current_hunk .= $_;
> $space++;
> $skip_logs++;
> } elsif (/^ /) {
> $current_hunk .= $_;
> $space++;
> $skip_logs = 0;
> } elsif (/^\+/) {
> $current_hunk .= $_;
> $plus++;
> } elsif (/^-/) {
> $current_hunk .= $_;
> $minus++;
> $skip_logs = 0;
> } else {
> die "Unexpected line: $_";
> }
> }
> }
>
> while (<>) {
> if (/^diff/) {
> do {
> check_file $_;
> } while(/^diff/);
> } else {
> printf $_;
> }
> }
>
--
Jon Smirl
jonsmirl@gmail.com
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion
2007-07-22 23:45 ` Jon Smirl
@ 2007-07-22 23:50 ` Johannes Schindelin
2007-07-23 0:11 ` Jon Smirl
0 siblings, 1 reply; 24+ messages in thread
From: Johannes Schindelin @ 2007-07-22 23:50 UTC (permalink / raw)
To: Jon Smirl; +Cc: Git Mailing List
Hi,
On Sun, 22 Jul 2007, Jon Smirl wrote:
> On 7/22/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> > As you already did, this is my attempt at a perl script... Feel free to
> > bash my Perl capabilities, or to correct it...
>
> I don't really know Perl but Perl is probably a better language for
> this. I was doing it in C.
Yes, Perl is much much better for an application like this. You do not
have to cater for newbies, but can keep it to the technical level.
> This doesn't run on the two full kernel samples I sent you.
You mean my script? Or your C program?
> That's the problem I was having, I can catch 95% of the expanded
> keywords with my program but I have to touch up 5% by hand. I'll stare
> at this and see if I can increase my understanding of Perl.
If it is my script, please tell me what is not working. I'll fix it. In
the process, I could even add some comments...
Ciao,
Dscho
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion
2007-07-22 23:50 ` Johannes Schindelin
@ 2007-07-23 0:11 ` Jon Smirl
2007-07-23 0:37 ` Johannes Schindelin
0 siblings, 1 reply; 24+ messages in thread
From: Jon Smirl @ 2007-07-23 0:11 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Git Mailing List
On 7/22/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> > This doesn't run on the two full kernel samples I sent you.
>
> You mean my script? Or your C program?
Your script, I sent two big samples directly to you as attachments
three hours ago.
Script processes them for a few pages of output and then bails out.
I have more diffs, but those two are representative of the sample
phytec patch has support for the NXP 3180 CPU which is not in the
kernel. I have a project interested in using this CPU so we are
looking at extracting the essence of 3180 support from the patch and
trying to apply it to the current kernel.
>From the phytec patch:
diff -uarN linux-2.6.10/arch/arm/vfp/entry.S
linux-2.6.10-lpc3180/arch/arm/vfp/entry.S
--- linux-2.6.10/arch/arm/vfp/entry.S 2004-12-25 05:35:27.000000000 +0800
+++ linux-2.6.10-lpc3180/arch/arm/vfp/entry.S 2006-11-20
15:49:30.000000000 +0800
@@ -19,6 +19,7 @@
#include <linux/init.h>
#include <asm/thread_info.h>
#include <asm/vfpmacros.h>
+#include <asm/constants.h>
.globl do_vfp
do_vfp:
Expected some changed $Revision line: +#define DRIVER_VERSION "$Revision: 1.2 $"
jonsmirl@jonsmirl:/extra/diffs$
>From the sonos one:
diff --git a/Documentation/cachetlb.txt b/Documentation/cachetlb.txt
index 127a661..02dec92 100644
--- a/Documentation/cachetlb.txt
+++ b/Documentation/cachetlb.txt
@@ -222,7 +222,7 @@
this value.
NOTE: This does not fix shared mmaps, check out the sparc64 port for
-one way to solve this (in particular SPARC_FLAG_MMAPSHARED).
+one way to solve this (in particular arch_get_unmapped_area).
Next, you have two methods to solve the D-cache aliasing issue for all
other cases. Please keep in mind that fact that, for a given page
Expected some changed $Id line: Readme-File
/usr/src/Documentation/cdrom/aztcd
jonsmirl@jonsmirl:/extra/diffs$
--
Jon Smirl
jonsmirl@gmail.com
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion
2007-07-23 0:11 ` Jon Smirl
@ 2007-07-23 0:37 ` Johannes Schindelin
2007-07-23 14:44 ` Jon Smirl
0 siblings, 1 reply; 24+ messages in thread
From: Johannes Schindelin @ 2007-07-23 0:37 UTC (permalink / raw)
To: Jon Smirl; +Cc: Git Mailing List
Hi,
On Sun, 22 Jul 2007, Jon Smirl wrote:
> On 7/22/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> > > This doesn't run on the two full kernel samples I sent you.
> >
> > You mean my script? Or your C program?
>
> Your script, I sent two big samples directly to you as attachments
> three hours ago.
Okay, I did not really test thoroughly, it seems. Sorry. Next try.
Ciao,
Dscho
-- snipsnap --
#!/usr/bin/perl
sub init_hunk {
$_ = $_[0];
$current_hunk = "";
$current_hunk_header = $_;
($start_minus, $dummy, $start_plus, $dummy) =
/^\@\@ -(\d+)(,\d+|) \+(\d+)(,\d+|) \@\@/;
$plus = $minus = $space = 0;
$skip_logs = 0;
$key = 0;
}
sub flush_hunk {
if ($plus > 0 || $minus > 0) {
if ($current_file ne "") {
print $current_file;
$current_file = "";
}
$minus += $space;
$plus += $space;
print "\@\@ -$start_minus,$minus "
. "+$start_plus,$plus \@\@\n";
print $current_hunk;
}
}
sub check_file {
$_ = $_[0];
$current_file = $_;
while (<>) {
if (/^\@\@/) {
last;
}
$current_file .= $_;
}
init_hunk $_;
# check hunks
while (<>) {
if ($skip_logs && /^\+ *\*/) {
# do nothing
} elsif ($key && /^\+.*\$$key.*\$/) {
# do nothing
} elsif (/^\@\@.*/) {
flush_hunk;
init_hunk $_;
} elsif (/^diff/) {
flush_hunk;
return;
} elsif (/^-.*\$(Id|Revision|Author|Date).*\$/) {
$key = $1;
s/^-/ /;
$current_hunk .= $_;
$space++;
$skip_logs = 0;
$key = 0;
} elsif (/^ .*\$Log.*\$/) {
$current_hunk .= $_;
$space++;
$skip_logs = 1;
$key = 0;
} elsif (/^ /) {
$current_hunk .= $_;
$space++;
$skip_logs = 0;
$key = 0;
} elsif (/^\+/) {
$current_hunk .= $_;
$plus++;
$key = 0;
} elsif (/^-/) {
$current_hunk .= $_;
$minus++;
$skip_logs = 0;
$key = 0;
} elsif (/^\\/) {
print $_;
} else {
die "Unexpected line: $_";
}
}
}
while (<>) {
if (/^diff/) {
do {
check_file $_;
} while(/^diff/);
} else {
printf $_;
}
}
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion
2007-07-23 0:37 ` Johannes Schindelin
@ 2007-07-23 14:44 ` Jon Smirl
2007-07-23 15:11 ` Simon 'corecode' Schubert
2007-07-23 20:11 ` Johannes Schindelin
0 siblings, 2 replies; 24+ messages in thread
From: Jon Smirl @ 2007-07-23 14:44 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Git Mailing List
On 7/22/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> Okay, I did not really test thoroughly, it seems. Sorry. Next try.
It's working a lot better. People deal with $Id two different ways.
One way they delete all of the $Id lines, that is the case of the
Sonos patch. In the Phytec patch they left all of the $Id lines in
place which caused them to get modified. In both cases you just want
the lines with $Id to disappear in the patch.
It doesn't catch the $Id case from the Phytec patch.
diff -uarN linux-2.6.10/arch/cris/arch-v10/boot/rescue/head.S
linux-2.6.10-lpc3180/arch/cris/arch-v10/boot/rescue/head.S
--- linux-2.6.10/arch/cris/arch-v10/boot/rescue/head.S 2004-12-25
05:35:24.000000000 +0800
+++ linux-2.6.10-lpc3180/arch/cris/arch-v10/boot/rescue/head.S
2006-11-20 15:49:30.000000000 +0800
@@ -1,4 +1,5 @@
/* $Id: head.S,v 1.6 2003/04/09 08:12:43 pkj Exp $
+/* $Id: head.S,v 1.2 2005/02/18 13:06:31 mike Exp $
*
* Rescue code, made to reside at the beginning of the
* flash-memory. when it starts, it checks a partition
It's not catching all of the $Revision and $Date deltas.
The output diff shouldn't contain any CVS keywords. It is somewhat
tricky to catch all of the cases and fix up the diffs. This filter
should get written and debugged once and then made part of something
like git so that it doesn't get written over and over again. Perl is
way better for this I had 1000 lines of C in my program and it was
still missing 10% of the cases.
--
Jon Smirl
jonsmirl@gmail.com
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion
2007-07-23 14:44 ` Jon Smirl
@ 2007-07-23 15:11 ` Simon 'corecode' Schubert
2007-07-23 20:11 ` Johannes Schindelin
1 sibling, 0 replies; 24+ messages in thread
From: Simon 'corecode' Schubert @ 2007-07-23 15:11 UTC (permalink / raw)
To: Jon Smirl; +Cc: Johannes Schindelin, Git Mailing List
Jon Smirl wrote:
> On 7/22/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
>> Okay, I did not really test thoroughly, it seems. Sorry. Next try.
>
> It's working a lot better. People deal with $Id two different ways.
> One way they delete all of the $Id lines, that is the case of the
> Sonos patch. In the Phytec patch they left all of the $Id lines in
> place which caused them to get modified. In both cases you just want
> the lines with $Id to disappear in the patch.
>
> It doesn't catch the $Id case from the Phytec patch.
>
> diff -uarN linux-2.6.10/arch/cris/arch-v10/boot/rescue/head.S
> linux-2.6.10-lpc3180/arch/cris/arch-v10/boot/rescue/head.S
> --- linux-2.6.10/arch/cris/arch-v10/boot/rescue/head.S 2004-12-25
> 05:35:24.000000000 +0800
> +++ linux-2.6.10-lpc3180/arch/cris/arch-v10/boot/rescue/head.S
> 2006-11-20 15:49:30.000000000 +0800
> @@ -1,4 +1,5 @@
> /* $Id: head.S,v 1.6 2003/04/09 08:12:43 pkj Exp $
> +/* $Id: head.S,v 1.2 2005/02/18 13:06:31 mike Exp $
> *
> * Rescue code, made to reside at the beginning of the
> * flash-memory. when it starts, it checks a partition
>
> It's not catching all of the $Revision and $Date deltas.
>
> The output diff shouldn't contain any CVS keywords. It is somewhat
> tricky to catch all of the cases and fix up the diffs. This filter
> should get written and debugged once and then made part of something
> like git so that it doesn't get written over and over again. Perl is
> way better for this I had 1000 lines of C in my program and it was
> still missing 10% of the cases.
Maybe I am missing something, but apart from $Log$, what's so hard about collapsing the CVS keywords? That's something like
s/\$(Id|Date|Header|CVSHeader|Author|Revision):[^$]*\$/$\1$/
or not?
I see that people want to modify the patch text to remove the churn, but that seems wrong. It would be much easier to just collapse the keywords before applying the patch, and then work on a newly created diff (from the git repo).
cheers
simon
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion
2007-07-23 14:44 ` Jon Smirl
2007-07-23 15:11 ` Simon 'corecode' Schubert
@ 2007-07-23 20:11 ` Johannes Schindelin
2007-07-24 0:43 ` Jon Smirl
1 sibling, 1 reply; 24+ messages in thread
From: Johannes Schindelin @ 2007-07-23 20:11 UTC (permalink / raw)
To: Jon Smirl; +Cc: Git Mailing List
Hi,
On Mon, 23 Jul 2007, Jon Smirl wrote:
> On 7/22/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> > Okay, I did not really test thoroughly, it seems. Sorry. Next try.
>
> It's working a lot better. People deal with $Id two different ways.
> One way they delete all of the $Id lines, that is the case of the
> Sonos patch. In the Phytec patch they left all of the $Id lines in
> place which caused them to get modified. In both cases you just want
> the lines with $Id to disappear in the patch.
>
> It doesn't catch the $Id case from the Phytec patch.
>
> diff -uarN linux-2.6.10/arch/cris/arch-v10/boot/rescue/head.S
> linux-2.6.10-lpc3180/arch/cris/arch-v10/boot/rescue/head.S
> --- linux-2.6.10/arch/cris/arch-v10/boot/rescue/head.S 2004-12-25
> 05:35:24.000000000 +0800
> +++ linux-2.6.10-lpc3180/arch/cris/arch-v10/boot/rescue/head.S
> 2006-11-20 15:49:30.000000000 +0800
> @@ -1,4 +1,5 @@
> /* $Id: head.S,v 1.6 2003/04/09 08:12:43 pkj Exp $
> +/* $Id: head.S,v 1.2 2005/02/18 13:06:31 mike Exp $
> *
> * Rescue code, made to reside at the beginning of the
> * flash-memory. when it starts, it checks a partition
>
> It's not catching all of the $Revision and $Date deltas.
>
> The output diff shouldn't contain any CVS keywords. It is somewhat
> tricky to catch all of the cases and fix up the diffs. This filter
> should get written and debugged once and then made part of something
> like git so that it doesn't get written over and over again. Perl is
> way better for this I had 1000 lines of C in my program and it was
> still missing 10% of the cases.
Next try. It includes documentation, skips $Source:$ and $Header:$ as
well, and the output diff does not contain any CVS keywords.
Ah yes, and if a file was actually created, it substitutes the --- file
name with /dev/null. In that case, the keywords are skipped, too!
However, when a file was deleted, they are _not_ skipped.
Strictly speaking, it is wrong to just cut out the lines containing CVS
keywords, since it is possible that there is an important change on that
line, but we can cross that bridge once companies play such cute games
with us.
But at least I tested the resulting patch with your phytec3180 patch this
time...
Ciao,
Dscho
-- snipsnap --
#!/usr/bin/perl
# This is a simple state machine.
#
# There is the state of the current file; its header is stored
# in $current_file to avoid outputting it when all hunks were
# culled. It is only printed before the first hunk, and then
# set to "" to avoid outputting it twice.
#
# There are the states of the current hunk, stored in
# * $current_hunk (possibly modified hunk)
# * $start_minus, $start_plus (from the original)
# * $plus, $minus, $space (the current count of the respective lines)
# a hunk is only printed (in flush_hunk) if any '+' or '-' lines
# are left after filtering.
#
# For $Log..$, there is the state $skip_logs, which is set to 1
# after seeing such a line, and set to 0 when the first line
# was seen which does not begin with '+'.
#
# A particularly nasty special case is when a single "*" was
# misattributed by the diff to be _inserted before_ a $Log, instead
# of _appended after_ a $Log.
# This is the purpose of the $before_log and $after_log variables:
# if not empty, the state machine expects the next line to begin
# with '+' or '-', respectively, and followed by a $Log. If this
# expectation is not met, the variable is output.
#
# The variable $plus_minus_adjust contains the number of lines which
# were skipped from the "+" side, so that the correct offset is shown.
# This function gets a hunk header.
#
# It initializes the state variables described above
sub init_hunk {
$_ = $_[0];
$current_hunk = "";
$current_hunk_header = $_;
($start_minus, $dummy, $start_plus, $dummy) =
/^\@\@ -(\d+)(,\d+|) \+(\d+)(,\d+|) \@\@/;
$plus = $minus = $space = 0;
$skip_logs = 0;
$before_log = '';
$after_log = '';
# we prefer /dev/null as original file name when a file is new
if ($start_minus eq 0) {
$current_file =~ s/\n--- .*\n/\n--- \/dev\/null\n/;
} elsif ($start_plus eq 0) {
$current_file =~ s/\n\+\+\+ .*\n/\n+++ \/dev\/null\n/;
}
}
# This function is called whenever there is possibly a hunk to print.
# Nothing is printed if no '+' or '-' lines are left.
# Otherwise, if the file header was not yet shown, it does so now.
sub flush_hunk {
if ($plus > 0 || $minus > 0) {
if ($current_file ne "") {
print $current_file;
$current_file = "";
}
$minus += $space;
$plus += $space;
print "\@\@ -$start_minus,$minus "
. "+" . ($start_plus - $start_plus_adjust)
. ",$plus \@\@\n";
print $current_hunk;
}
}
# This adds a line to the current hunk and updates $space, $plus or $minus
sub add_line {
my $line = $_[0];
$current_hunk .= $line;
if ($line =~ /^ /) {
$space++;
} elsif ($line =~ /^\+/) {
$plus++;
} elsif ($line =~ /^-/) {
$minus++;
} elsif ($line =~ /^\\/) {
# do nothing
} else {
die "Unexpected line: $line";
}
}
# This function splits the current hunk into the part before the current
# line, and the part after the current line.
sub skip_line {
my $line = $_[0];
if ($start_minus == 0) {
# This patch adds a new file, just ignore that line
return;
} elsif ($start_plus == 0) {
# This patch removes a file, so include the line nevertheless
add_line $_;
return;
}
flush_hunk;
if ($line =~ /^-/) {
$minus++;
} elsif ($line =~ /^\+/) {
$plus++;
$start_plus_adjust++;
}
init_hunk "@@ -" . ($start_minus + $minus + $space)
. " +" . ($start_plus + $plus + $space)
. " @@\n";
}
$simple_keyword = "Id|Revision|Author|Date|Source|Header";
# This is the main loop
sub check_file {
$_ = $_[0];
$current_file = $_;
$start_plus_adjust = 0;
while (<>) {
if (/^\@\@/) {
last;
}
$current_file .= $_;
}
init_hunk $_;
# check hunks
while (<>) {
if ($before_log) {
if (!/\+.*\$Log.*\$/) {
add_line $before_log;
} else {
skip_line $before_log;
}
$before_log = '';
}
if ($after_log) {
if (!/-.*\$Log.*\$/) {
add_line $after_log;
} else {
skip_line $after_log;
}
$after_log = '';
}
if ($skip_logs) {
if (/^\+/) {
skip_line $_;
$skip_logs = 1;
} else {
$skip_logs = 0;
if (/^ *\*$/) {
$after_log = $_;
}
}
} elsif (/^\+.*\$($simple_keyword).*\$/) {
skip_line $_;
} elsif (/^\@\@.*/) {
flush_hunk;
init_hunk $_;
} elsif (/^diff/) {
flush_hunk;
return;
} elsif (/^-.*\$($simple_keyword).*\$/) {
# fake new hunk
skip_line $_;
} elsif (/^\+ *\*$/) {
$before_log = $_;
} elsif (/^([- \+]).*\$Log.*\$/) {
skip_line $_;
$skip_logs = 1;
} else {
add_line $_;
}
}
}
# This loop just shows everything before the first diff, and then hands
# over to check_file whenever it sees a line beginning with "diff".
while (<>) {
if (/^diff/) {
do {
check_file $_;
} while(/^diff/);
flush_hunk;
} else {
printf $_;
}
}
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion
2007-07-23 20:11 ` Johannes Schindelin
@ 2007-07-24 0:43 ` Jon Smirl
2007-07-24 1:06 ` Jon Smirl
2007-07-24 1:16 ` Johannes Schindelin
0 siblings, 2 replies; 24+ messages in thread
From: Jon Smirl @ 2007-07-24 0:43 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Git Mailing List
Thanks for working on this. I'd like to see it added to git toolkit.
This diff is way easier to read now. I can see that it has SPI support
backported from some future version.
But... it still has some problems.
For the phytec patch it's not getting the $Log changes in the qlogic
files right.
I'm checking the output diff line by line for problems. It's down to
11,528 lines from 88,787. That's a lot of junk removed, I'll have to
make sure it isn't removing too much.
--
Jon Smirl
jonsmirl@gmail.com
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion
2007-07-24 0:43 ` Jon Smirl
@ 2007-07-24 1:06 ` Jon Smirl
2007-07-24 1:16 ` Johannes Schindelin
1 sibling, 0 replies; 24+ messages in thread
From: Jon Smirl @ 2007-07-24 1:06 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Git Mailing List
On 7/23/07, Jon Smirl <jonsmirl@gmail.com> wrote:
> Thanks for working on this. I'd like to see it added to git toolkit.
> This diff is way easier to read now. I can see that it has SPI support
> backported from some future version.
>
> But... it still has some problems.
> For the phytec patch it's not getting the $Log changes in the qlogic
> files right.
>
> I'm checking the output diff line by line for problems. It's down to
> 11,528 lines from 88,787. That's a lot of junk removed, I'll have to
> make sure it isn't removing too much.
I forgot about newly added files, looks like only 20,000 lines of CVS noise.
The phytec patch has several ia64 files added to it, those are obvious
not part of the support and an ARM9 processor. Never noticed those
before.
>
> --
> Jon Smirl
> jonsmirl@gmail.com
>
--
Jon Smirl
jonsmirl@gmail.com
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion
2007-07-24 0:43 ` Jon Smirl
2007-07-24 1:06 ` Jon Smirl
@ 2007-07-24 1:16 ` Johannes Schindelin
2007-07-24 10:16 ` Jakub Narebski
1 sibling, 1 reply; 24+ messages in thread
From: Johannes Schindelin @ 2007-07-24 1:16 UTC (permalink / raw)
To: Jon Smirl; +Cc: Git Mailing List
Hi,
On Mon, 23 Jul 2007, Jon Smirl wrote:
> Thanks for working on this. I'd like to see it added to git toolkit.
I plan to submit it to patchutils instead, since this is not really
dependent on git.
> This diff is way easier to read now. I can see that it has SPI support
> backported from some future version.
>
> But... it still has some problems. For the phytec patch it's not getting
> the $Log changes in the qlogic files right.
You are correct. This slipped by my eyes.
> I'm checking the output diff line by line for problems. It's down to
> 11,528 lines from 88,787. That's a lot of junk removed, I'll have to
> make sure it isn't removing too much.
May I ask you to test this version?
Thanks,
Dscho
-- snipsnap --
#!/usr/bin/perl
# This is a simple state machine.
#
# There is the state of the current file; its header is stored
# in $current_file to avoid outputting it when all hunks were
# culled. It is only printed before the first hunk, and then
# set to "" to avoid outputting it twice.
#
# There are the states of the current hunk, stored in
# * $current_hunk (possibly modified hunk)
# * $start_minus, $start_plus (from the original)
# * $plus, $minus, $space (the current count of the respective lines)
# a hunk is only printed (in flush_hunk) if any '+' or '-' lines
# are left after filtering.
#
# For $Log..$, there is the state $skip_logs, which is set to 1
# after seeing such a line, and set to 0 when the first line
# was seen which does not begin with '+'.
#
# A particularly nasty special case is when a single "*" was
# misattributed by the diff to be _inserted before_ a $Log, instead
# of _appended after_ a $Log.
# This is the purpose of the $before_log and $after_log variables:
# if not empty, the state machine expects the next line to begin
# with '+' or '-', respectively, and followed by a $Log. If this
# expectation is not met, the variable is output.
#
# The variable $plus_minus_adjust contains the number of lines which
# were skipped from the "+" side, so that the correct offset is shown.
# This function gets a hunk header.
#
# It initializes the state variables described above
sub init_hunk {
my $line = $_[0];
$current_hunk = "";
($start_minus, $dummy, $start_plus, $dummy) =
($line =~ /^\@\@ -(\d+)(,\d+|) \+(\d+)(,\d+|) \@\@/);
$plus = $minus = $space = 0;
$skip_logs = 0;
$before_log = '';
$after_log = '';
# we prefer /dev/null as original file name when a file is new
if ($start_minus eq 0) {
$current_file =~ s/\n--- .*\n/\n--- \/dev\/null\n/;
} elsif ($start_plus eq 0) {
$current_file =~ s/\n\+\+\+ .*\n/\n+++ \/dev\/null\n/;
}
}
# This function is called whenever there is possibly a hunk to print.
# Nothing is printed if no '+' or '-' lines are left.
# Otherwise, if the file header was not yet shown, it does so now.
sub flush_hunk {
if (($plus > 0 || $minus > 0) && $current_hunk ne '') {
if ($current_file ne "") {
print $current_file;
$current_file = "";
}
$minus += $space;
$plus += $space;
print "\@\@ -$start_minus,$minus "
. "+" . ($start_plus - $start_plus_adjust)
. ",$plus \@\@\n";
print $current_hunk;
$current_hunk = '';
}
}
# This adds a line to the current hunk and updates $space, $plus or $minus
sub add_line {
my $line = $_[0];
$current_hunk .= $line;
if ($line =~ /^ /) {
$space++;
} elsif ($line =~ /^\+/) {
$plus++;
} elsif ($line =~ /^-/) {
$minus++;
} elsif ($line =~ /^\\/) {
# do nothing
} else {
die "Unexpected line: $line";
}
}
# This function splits the current hunk into the part before the current
# line, and the part after the current line.
sub skip_line {
my $line = $_[0];
if ($start_minus == 0) {
# This patch adds a new file, just ignore that line
return;
} elsif ($start_plus == 0) {
# This patch removes a file, so include the line nevertheless
add_line $_;
return;
}
flush_hunk;
if ($line =~ /^-/) {
$minus++;
} elsif ($line =~ /^\+/) {
$plus++;
$start_plus_adjust++;
}
init_hunk "@@ -" . ($start_minus + $minus + $space)
. " +" . ($start_plus + $plus + $space)
. " @@\n";
}
$simple_keyword = "Id|Revision|Author|Date|Source|Header";
# This is the main loop
sub check_file {
$_ = $_[0];
$current_file = $_;
$start_plus_adjust = 0;
while (<>) {
if (/^\@\@/) {
last;
}
$current_file .= $_;
}
init_hunk $_;
# check hunks
while (<>) {
if ($before_log) {
if (!/\+.*\$Log.*\$/) {
add_line $before_log;
} else {
skip_line $before_log;
}
$before_log = '';
}
if ($after_log) {
if (!/-.*\$Log.*\$/) {
add_line $after_log;
} else {
skip_line $after_log;
}
$after_log = '';
}
if ($skip_logs) {
if (/^\+/) {
skip_line $_;
$skip_logs = 1;
} else {
$skip_logs = 0;
if (/^ *\*$/) {
$after_log = $_;
}
}
} elsif (/^\+.*\$($simple_keyword).*\$/) {
skip_line $_;
} elsif (/^\@\@.*/) {
flush_hunk;
init_hunk $_;
} elsif (/^diff/) {
flush_hunk;
return;
} elsif (/^-.*\$($simple_keyword).*\$/) {
# fake new hunk
skip_line $_;
} elsif (/^\+ *\*$/) {
$before_log = $_;
} elsif (/^([- \+]).*\$Log.*\$/) {
skip_line $_;
$skip_logs = 1;
} else {
add_line $_;
}
}
}
# This loop just shows everything before the first diff, and then hands
# over to check_file whenever it sees a line beginning with "diff".
while (<>) {
if (/^diff/) {
do {
check_file $_;
} while(/^diff/);
} else {
printf $_;
}
}
flush_hunk;
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion
2007-07-24 1:16 ` Johannes Schindelin
@ 2007-07-24 10:16 ` Jakub Narebski
2007-07-24 11:26 ` Johannes Schindelin
0 siblings, 1 reply; 24+ messages in thread
From: Jakub Narebski @ 2007-07-24 10:16 UTC (permalink / raw)
To: git
[Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>, git@vger.kernel.org]
Johannes Schindelin wrote:
> On Mon, 23 Jul 2007, Jon Smirl wrote:
>
>> Thanks for working on this. I'd like to see it added to git toolkit.
>
> I plan to submit it to patchutils instead, since this is not really
> dependent on git.
Could you also add it to contrib/ area, please?
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion
2007-07-24 10:16 ` Jakub Narebski
@ 2007-07-24 11:26 ` Johannes Schindelin
2007-07-24 13:08 ` Jon Smirl
0 siblings, 1 reply; 24+ messages in thread
From: Johannes Schindelin @ 2007-07-24 11:26 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git, Jon Smirl
Hi,
On Tue, 24 Jul 2007, Jakub Narebski wrote:
> [Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>, git@vger.kernel.org]
Thanks.
> Johannes Schindelin wrote:
>
> > On Mon, 23 Jul 2007, Jon Smirl wrote:
> >
> >> Thanks for working on this. I'd like to see it added to git toolkit.
> >
> > I plan to submit it to patchutils instead, since this is not really
> > dependent on git.
>
> Could you also add it to contrib/ area, please?
Sure. Once it is really all fleshed out. But I really think that
patchutils is a better place. That way, the script also helps the poor
souls stuck with Mercurial or Darcs :-)
The really tedious part is the testing, and the verifying. Fortunately,
Jon made up for my incapability in both areas, with an incredible
patience.
Ciao,
Dscho
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion
2007-07-24 11:26 ` Johannes Schindelin
@ 2007-07-24 13:08 ` Jon Smirl
0 siblings, 0 replies; 24+ messages in thread
From: Jon Smirl @ 2007-07-24 13:08 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Jakub Narebski, git
On 7/24/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> The really tedious part is the testing, and the verifying. Fortunately,
> Jon made up for my incapability in both areas, with an incredible
> patience.
I haven't finished checking every last line yet, if anything is left
it is small. We lost power here all evening last night. That's not
supposed to happen in urban Boston. It was black over a mile from me
in all directions.
>
> Ciao,
> Dscho
>
>
--
Jon Smirl
jonsmirl@gmail.com
^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2007-07-24 13:08 UTC | newest]
Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-07-22 18:48 Git help for kernel archeology, suppress diffs caused by CVS keyword expansion Jon Smirl
2007-07-22 19:00 ` Johannes Schindelin
2007-07-22 19:10 ` Jon Smirl
2007-07-22 19:15 ` Johannes Schindelin
2007-07-22 19:48 ` Jon Smirl
2007-07-22 21:39 ` Johannes Schindelin
2007-07-22 23:45 ` Jon Smirl
2007-07-22 23:50 ` Johannes Schindelin
2007-07-23 0:11 ` Jon Smirl
2007-07-23 0:37 ` Johannes Schindelin
2007-07-23 14:44 ` Jon Smirl
2007-07-23 15:11 ` Simon 'corecode' Schubert
2007-07-23 20:11 ` Johannes Schindelin
2007-07-24 0:43 ` Jon Smirl
2007-07-24 1:06 ` Jon Smirl
2007-07-24 1:16 ` Johannes Schindelin
2007-07-24 10:16 ` Jakub Narebski
2007-07-24 11:26 ` Johannes Schindelin
2007-07-24 13:08 ` Jon Smirl
2007-07-22 19:09 ` Linus Torvalds
2007-07-22 19:12 ` Jon Smirl
2007-07-22 19:37 ` Linus Torvalds
2007-07-22 19:17 ` David Kastrup
2007-07-22 19:47 ` Jan Engelhardt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).