* Git help for kernel archeology, suppress diffs caused by CVS keyword expansion @ 2007-07-22 18:48 Jon Smirl 2007-07-22 19:00 ` Johannes Schindelin ` (2 more replies) 0 siblings, 3 replies; 24+ messages in thread From: Jon Smirl @ 2007-07-22 18:48 UTC (permalink / raw) To: Git Mailing List, lkml It would really be useful if git diff had an option for suppressing diffs caused by CVS keyword expansion. I run into this problem over and over when trying to recover stuff out of old kernel sources that people checked into CVS and then posted CVS diffs to fulfill their GPL obligations. I sometimes wonder if vendors are doing this on purpose to make it more difficult to recover the changes they made to the code. To prevent this in the future, I'd love to see a patch removing all CVS keywords from the kernel sources. Quick grep of kernel shows 1,535 $Id, 197 $Log, 441 $Revision, 144 $Date. I guestimate that this would remove about 5,000 lines form the kernel source and touch 1,700 files. Would this be accept or do those expansions contain useful info? Example diff caused by $Log expansion: diff --git a/arch/cris/kernel/ptrace.c b/arch/cris/kernel/ptrace.c index b3d10fb..32e47ca 100644 --- a/arch/cris/kernel/ptrace.c +++ b/arch/cris/kernel/ptrace.c @@ -8,6 +8,9 @@ * Authors: Bjorn Wesen * * $Log: ptrace.c,v $ + * Revision 1.2 2003/06/26 21:08:16 vangool + * Commit kernel changes. + * * Revision 1.8 2001/11/12 18:26:21 pkj * Fixed compiler warnings. * Or in this case the person edited out all of the lines in the kernel containing $Id diff --git a/arch/i386/boot/tools/build.c b/arch/i386/boot/tools/build.c index 2edd0a4..792aeb1 100644 --- a/arch/i386/boot/tools/build.c +++ b/arch/i386/boot/tools/build.c @@ -1,5 +1,4 @@ /* - * $Id: build.c,v 1.5 1997/05/19 12:29:58 mj Exp $ * * Copyright (C) 1991, 1992 Linus Torvalds * Copyright (C) 1997 Martin Mares A variation on this where the Id was updated.. - * $Id: build.c,v 1.5 1997/05/19 12:29:58 mj Exp $ + * $Id: build.c,v 1.5 2002/09/29 12:29:58 mj Exp $ There are a more CVS keywords that can cause expansion, probably around twenty. When removing these with a simple grep it can get fooled and mess up. CVS keyword expansion adds about 60,000 lines of noise to a typical diff. -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion 2007-07-22 18:48 Git help for kernel archeology, suppress diffs caused by CVS keyword expansion Jon Smirl @ 2007-07-22 19:00 ` Johannes Schindelin 2007-07-22 19:10 ` Jon Smirl 2007-07-22 19:09 ` Linus Torvalds 2007-07-22 19:47 ` Jan Engelhardt 2 siblings, 1 reply; 24+ messages in thread From: Johannes Schindelin @ 2007-07-22 19:00 UTC (permalink / raw) To: Jon Smirl; +Cc: Git Mailing List Hi, [culling lkml list, since this is really a Git problem] On Sun, 22 Jul 2007, Jon Smirl wrote: > It would really be useful if git diff had an option for suppressing > diffs caused by CVS keyword expansion. Looks to me slightly a bit like an XY problem. How about using git-filter-branch to get rid of the expansions? Or even better, if you have access to the CVS server, use the -k option to cvsimport? Hth, Dscho ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion 2007-07-22 19:00 ` Johannes Schindelin @ 2007-07-22 19:10 ` Jon Smirl 2007-07-22 19:15 ` Johannes Schindelin 0 siblings, 1 reply; 24+ messages in thread From: Jon Smirl @ 2007-07-22 19:10 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Git Mailing List On 7/22/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > Hi, > > [culling lkml list, since this is really a Git problem] > > On Sun, 22 Jul 2007, Jon Smirl wrote: > > > It would really be useful if git diff had an option for suppressing > > diffs caused by CVS keyword expansion. > > Looks to me slightly a bit like an XY problem. > > How about using git-filter-branch to get rid of the expansions? Or even > better, if you have access to the CVS server, use the -k option to > cvsimport? No access to the server the people handing over the diffs and not making things easy. You get diffs like this when asking companies to turn over their changes for GPL compliance purposes. A big source is from companies building embedded systems., Usually the diffs are pretty worthless but every once in a while they contain very useful nuggets (like inadvertently documenting secret hardware features). It is hard to spot the nuggets in all of the noise. > > Hth, > Dscho > > -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion 2007-07-22 19:10 ` Jon Smirl @ 2007-07-22 19:15 ` Johannes Schindelin 2007-07-22 19:48 ` Jon Smirl 0 siblings, 1 reply; 24+ messages in thread From: Johannes Schindelin @ 2007-07-22 19:15 UTC (permalink / raw) To: Jon Smirl; +Cc: Git Mailing List Hi, On Sun, 22 Jul 2007, Jon Smirl wrote: > On 7/22/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > > > On Sun, 22 Jul 2007, Jon Smirl wrote: > > > > > It would really be useful if git diff had an option for suppressing > > > diffs caused by CVS keyword expansion. > > > > Looks to me slightly a bit like an XY problem. > > > > How about using git-filter-branch to get rid of the expansions? Or > > even better, if you have access to the CVS server, use the -k option > > to cvsimport? > > No access to the server the people handing over the diffs and not > making things easy. Ah yes, that's what I feared. Darn. But still, I think that it would be much better not to put this into Git. We do have diff gitattributes now, so that you can roll your own diff for specific files, but I still think that this is more a task for a standalone perl script. Possibly being called from filter-branch to be done with the conversion once and for all times. Ciao, Dscho ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion 2007-07-22 19:15 ` Johannes Schindelin @ 2007-07-22 19:48 ` Jon Smirl 2007-07-22 21:39 ` Johannes Schindelin 0 siblings, 1 reply; 24+ messages in thread From: Jon Smirl @ 2007-07-22 19:48 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Git Mailing List On 7/22/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > But still, I think that it would be much better not to put this into Git. > We do have diff gitattributes now, so that you can roll your own diff for > specific files, but I still think that this is more a task for a > standalone perl script. Possibly being called from filter-branch to be > done with the conversion once and for all times. I can provide sample diffs if you want something to play with. -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion 2007-07-22 19:48 ` Jon Smirl @ 2007-07-22 21:39 ` Johannes Schindelin 2007-07-22 23:45 ` Jon Smirl 0 siblings, 1 reply; 24+ messages in thread From: Johannes Schindelin @ 2007-07-22 21:39 UTC (permalink / raw) To: Jon Smirl; +Cc: Git Mailing List Hi, On Sun, 22 Jul 2007, Jon Smirl wrote: > On 7/22/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > > But still, I think that it would be much better not to put this into Git. > > We do have diff gitattributes now, so that you can roll your own diff for > > specific files, but I still think that this is more a task for a > > standalone perl script. Possibly being called from filter-branch to be > > done with the conversion once and for all times. > > I can provide sample diffs if you want something to play with. As you already did, this is my attempt at a perl script... Feel free to bash my Perl capabilities, or to correct it... Ciao, Dscho -- snipsnap -- #!/usr/bin/perl sub init_hunk { $_ = $_[0]; $current_hunk = ""; $current_hunk_header = $_; ($start_minus, $dummy, $start_plus, $dummy) = /^\@\@ -(\d+)(,\d+|) \+(\d+)(,\d+|) \@\@/; $plus = $minus = $space = 0; $skip_logs = 0; } sub flush_hunk { if ($plus > 0 || $minus > 0) { if ($current_file ne "") { print $current_file; $current_file = ""; } $minus += $space; $plus += $space; print "\@\@ -$start_minus,$minus " . "+$start_plus,$plus \@\@\n"; print $current_hunk; } } sub check_file { $_ = $_[0]; $current_file = $_; while (<>) { if (/^\@\@/) { last; } $current_file .= $_; } init_hunk $_; # check hunks while (<>) { if ($skip_logs && /^\+ *\*/) { # do nothing } elsif (/^\@\@.*/) { flush_hunk; init_hunk $_; } elsif (/^diff/) { flush_hunk; return; } elsif (/^-.*\$(Id|Revision|Author|Date).*\$/) { $key = $1; s/^-/ /; $current_hunk .= $_; $space++; $_ = <>; if (!/\+.*\$Id.*\$/) { die "Expected some changed \$$key line: $_"; } $skip_logs = 0; } elsif (/^ .*\$Log.*\$/) { $current_hunk .= $_; $space++; $skip_logs++; } elsif (/^ /) { $current_hunk .= $_; $space++; $skip_logs = 0; } elsif (/^\+/) { $current_hunk .= $_; $plus++; } elsif (/^-/) { $current_hunk .= $_; $minus++; $skip_logs = 0; } else { die "Unexpected line: $_"; } } } while (<>) { if (/^diff/) { do { check_file $_; } while(/^diff/); } else { printf $_; } } ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion 2007-07-22 21:39 ` Johannes Schindelin @ 2007-07-22 23:45 ` Jon Smirl 2007-07-22 23:50 ` Johannes Schindelin 0 siblings, 1 reply; 24+ messages in thread From: Jon Smirl @ 2007-07-22 23:45 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Git Mailing List On 7/22/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > As you already did, this is my attempt at a perl script... Feel free to > bash my Perl capabilities, or to correct it... I don't really know Perl but Perl is probably a better language for this. I was doing it in C. This doesn't run on the two full kernel samples I sent you. That's the problem I was having, I can catch 95% of the expanded keywords with my program but I have to touch up 5% by hand. I'll stare at this and see if I can increase my understanding of Perl. > > Ciao, > Dscho > > -- snipsnap -- > #!/usr/bin/perl > > sub init_hunk { > $_ = $_[0]; > $current_hunk = ""; > $current_hunk_header = $_; > ($start_minus, $dummy, $start_plus, $dummy) = > /^\@\@ -(\d+)(,\d+|) \+(\d+)(,\d+|) \@\@/; > $plus = $minus = $space = 0; > $skip_logs = 0; > } > > sub flush_hunk { > if ($plus > 0 || $minus > 0) { > if ($current_file ne "") { > print $current_file; > $current_file = ""; > } > $minus += $space; > $plus += $space; > print "\@\@ -$start_minus,$minus " > . "+$start_plus,$plus \@\@\n"; > print $current_hunk; > } > } > > sub check_file { > $_ = $_[0]; > $current_file = $_; > while (<>) { > if (/^\@\@/) { > last; > } > $current_file .= $_; > } > > init_hunk $_; > > # check hunks > while (<>) { > if ($skip_logs && /^\+ *\*/) { > # do nothing > } elsif (/^\@\@.*/) { > flush_hunk; > init_hunk $_; > } elsif (/^diff/) { > flush_hunk; > return; > } elsif (/^-.*\$(Id|Revision|Author|Date).*\$/) { > $key = $1; > s/^-/ /; > $current_hunk .= $_; > $space++; > $_ = <>; > if (!/\+.*\$Id.*\$/) { > die "Expected some changed \$$key line: $_"; > } > $skip_logs = 0; > } elsif (/^ .*\$Log.*\$/) { > $current_hunk .= $_; > $space++; > $skip_logs++; > } elsif (/^ /) { > $current_hunk .= $_; > $space++; > $skip_logs = 0; > } elsif (/^\+/) { > $current_hunk .= $_; > $plus++; > } elsif (/^-/) { > $current_hunk .= $_; > $minus++; > $skip_logs = 0; > } else { > die "Unexpected line: $_"; > } > } > } > > while (<>) { > if (/^diff/) { > do { > check_file $_; > } while(/^diff/); > } else { > printf $_; > } > } > -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion 2007-07-22 23:45 ` Jon Smirl @ 2007-07-22 23:50 ` Johannes Schindelin 2007-07-23 0:11 ` Jon Smirl 0 siblings, 1 reply; 24+ messages in thread From: Johannes Schindelin @ 2007-07-22 23:50 UTC (permalink / raw) To: Jon Smirl; +Cc: Git Mailing List Hi, On Sun, 22 Jul 2007, Jon Smirl wrote: > On 7/22/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > > As you already did, this is my attempt at a perl script... Feel free to > > bash my Perl capabilities, or to correct it... > > I don't really know Perl but Perl is probably a better language for > this. I was doing it in C. Yes, Perl is much much better for an application like this. You do not have to cater for newbies, but can keep it to the technical level. > This doesn't run on the two full kernel samples I sent you. You mean my script? Or your C program? > That's the problem I was having, I can catch 95% of the expanded > keywords with my program but I have to touch up 5% by hand. I'll stare > at this and see if I can increase my understanding of Perl. If it is my script, please tell me what is not working. I'll fix it. In the process, I could even add some comments... Ciao, Dscho ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion 2007-07-22 23:50 ` Johannes Schindelin @ 2007-07-23 0:11 ` Jon Smirl 2007-07-23 0:37 ` Johannes Schindelin 0 siblings, 1 reply; 24+ messages in thread From: Jon Smirl @ 2007-07-23 0:11 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Git Mailing List On 7/22/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > > This doesn't run on the two full kernel samples I sent you. > > You mean my script? Or your C program? Your script, I sent two big samples directly to you as attachments three hours ago. Script processes them for a few pages of output and then bails out. I have more diffs, but those two are representative of the sample phytec patch has support for the NXP 3180 CPU which is not in the kernel. I have a project interested in using this CPU so we are looking at extracting the essence of 3180 support from the patch and trying to apply it to the current kernel. >From the phytec patch: diff -uarN linux-2.6.10/arch/arm/vfp/entry.S linux-2.6.10-lpc3180/arch/arm/vfp/entry.S --- linux-2.6.10/arch/arm/vfp/entry.S 2004-12-25 05:35:27.000000000 +0800 +++ linux-2.6.10-lpc3180/arch/arm/vfp/entry.S 2006-11-20 15:49:30.000000000 +0800 @@ -19,6 +19,7 @@ #include <linux/init.h> #include <asm/thread_info.h> #include <asm/vfpmacros.h> +#include <asm/constants.h> .globl do_vfp do_vfp: Expected some changed $Revision line: +#define DRIVER_VERSION "$Revision: 1.2 $" jonsmirl@jonsmirl:/extra/diffs$ >From the sonos one: diff --git a/Documentation/cachetlb.txt b/Documentation/cachetlb.txt index 127a661..02dec92 100644 --- a/Documentation/cachetlb.txt +++ b/Documentation/cachetlb.txt @@ -222,7 +222,7 @@ this value. NOTE: This does not fix shared mmaps, check out the sparc64 port for -one way to solve this (in particular SPARC_FLAG_MMAPSHARED). +one way to solve this (in particular arch_get_unmapped_area). Next, you have two methods to solve the D-cache aliasing issue for all other cases. Please keep in mind that fact that, for a given page Expected some changed $Id line: Readme-File /usr/src/Documentation/cdrom/aztcd jonsmirl@jonsmirl:/extra/diffs$ -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion 2007-07-23 0:11 ` Jon Smirl @ 2007-07-23 0:37 ` Johannes Schindelin 2007-07-23 14:44 ` Jon Smirl 0 siblings, 1 reply; 24+ messages in thread From: Johannes Schindelin @ 2007-07-23 0:37 UTC (permalink / raw) To: Jon Smirl; +Cc: Git Mailing List Hi, On Sun, 22 Jul 2007, Jon Smirl wrote: > On 7/22/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > > > This doesn't run on the two full kernel samples I sent you. > > > > You mean my script? Or your C program? > > Your script, I sent two big samples directly to you as attachments > three hours ago. Okay, I did not really test thoroughly, it seems. Sorry. Next try. Ciao, Dscho -- snipsnap -- #!/usr/bin/perl sub init_hunk { $_ = $_[0]; $current_hunk = ""; $current_hunk_header = $_; ($start_minus, $dummy, $start_plus, $dummy) = /^\@\@ -(\d+)(,\d+|) \+(\d+)(,\d+|) \@\@/; $plus = $minus = $space = 0; $skip_logs = 0; $key = 0; } sub flush_hunk { if ($plus > 0 || $minus > 0) { if ($current_file ne "") { print $current_file; $current_file = ""; } $minus += $space; $plus += $space; print "\@\@ -$start_minus,$minus " . "+$start_plus,$plus \@\@\n"; print $current_hunk; } } sub check_file { $_ = $_[0]; $current_file = $_; while (<>) { if (/^\@\@/) { last; } $current_file .= $_; } init_hunk $_; # check hunks while (<>) { if ($skip_logs && /^\+ *\*/) { # do nothing } elsif ($key && /^\+.*\$$key.*\$/) { # do nothing } elsif (/^\@\@.*/) { flush_hunk; init_hunk $_; } elsif (/^diff/) { flush_hunk; return; } elsif (/^-.*\$(Id|Revision|Author|Date).*\$/) { $key = $1; s/^-/ /; $current_hunk .= $_; $space++; $skip_logs = 0; $key = 0; } elsif (/^ .*\$Log.*\$/) { $current_hunk .= $_; $space++; $skip_logs = 1; $key = 0; } elsif (/^ /) { $current_hunk .= $_; $space++; $skip_logs = 0; $key = 0; } elsif (/^\+/) { $current_hunk .= $_; $plus++; $key = 0; } elsif (/^-/) { $current_hunk .= $_; $minus++; $skip_logs = 0; $key = 0; } elsif (/^\\/) { print $_; } else { die "Unexpected line: $_"; } } } while (<>) { if (/^diff/) { do { check_file $_; } while(/^diff/); } else { printf $_; } } ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion 2007-07-23 0:37 ` Johannes Schindelin @ 2007-07-23 14:44 ` Jon Smirl 2007-07-23 15:11 ` Simon 'corecode' Schubert 2007-07-23 20:11 ` Johannes Schindelin 0 siblings, 2 replies; 24+ messages in thread From: Jon Smirl @ 2007-07-23 14:44 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Git Mailing List On 7/22/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > Okay, I did not really test thoroughly, it seems. Sorry. Next try. It's working a lot better. People deal with $Id two different ways. One way they delete all of the $Id lines, that is the case of the Sonos patch. In the Phytec patch they left all of the $Id lines in place which caused them to get modified. In both cases you just want the lines with $Id to disappear in the patch. It doesn't catch the $Id case from the Phytec patch. diff -uarN linux-2.6.10/arch/cris/arch-v10/boot/rescue/head.S linux-2.6.10-lpc3180/arch/cris/arch-v10/boot/rescue/head.S --- linux-2.6.10/arch/cris/arch-v10/boot/rescue/head.S 2004-12-25 05:35:24.000000000 +0800 +++ linux-2.6.10-lpc3180/arch/cris/arch-v10/boot/rescue/head.S 2006-11-20 15:49:30.000000000 +0800 @@ -1,4 +1,5 @@ /* $Id: head.S,v 1.6 2003/04/09 08:12:43 pkj Exp $ +/* $Id: head.S,v 1.2 2005/02/18 13:06:31 mike Exp $ * * Rescue code, made to reside at the beginning of the * flash-memory. when it starts, it checks a partition It's not catching all of the $Revision and $Date deltas. The output diff shouldn't contain any CVS keywords. It is somewhat tricky to catch all of the cases and fix up the diffs. This filter should get written and debugged once and then made part of something like git so that it doesn't get written over and over again. Perl is way better for this I had 1000 lines of C in my program and it was still missing 10% of the cases. -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion 2007-07-23 14:44 ` Jon Smirl @ 2007-07-23 15:11 ` Simon 'corecode' Schubert 2007-07-23 20:11 ` Johannes Schindelin 1 sibling, 0 replies; 24+ messages in thread From: Simon 'corecode' Schubert @ 2007-07-23 15:11 UTC (permalink / raw) To: Jon Smirl; +Cc: Johannes Schindelin, Git Mailing List Jon Smirl wrote: > On 7/22/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: >> Okay, I did not really test thoroughly, it seems. Sorry. Next try. > > It's working a lot better. People deal with $Id two different ways. > One way they delete all of the $Id lines, that is the case of the > Sonos patch. In the Phytec patch they left all of the $Id lines in > place which caused them to get modified. In both cases you just want > the lines with $Id to disappear in the patch. > > It doesn't catch the $Id case from the Phytec patch. > > diff -uarN linux-2.6.10/arch/cris/arch-v10/boot/rescue/head.S > linux-2.6.10-lpc3180/arch/cris/arch-v10/boot/rescue/head.S > --- linux-2.6.10/arch/cris/arch-v10/boot/rescue/head.S 2004-12-25 > 05:35:24.000000000 +0800 > +++ linux-2.6.10-lpc3180/arch/cris/arch-v10/boot/rescue/head.S > 2006-11-20 15:49:30.000000000 +0800 > @@ -1,4 +1,5 @@ > /* $Id: head.S,v 1.6 2003/04/09 08:12:43 pkj Exp $ > +/* $Id: head.S,v 1.2 2005/02/18 13:06:31 mike Exp $ > * > * Rescue code, made to reside at the beginning of the > * flash-memory. when it starts, it checks a partition > > It's not catching all of the $Revision and $Date deltas. > > The output diff shouldn't contain any CVS keywords. It is somewhat > tricky to catch all of the cases and fix up the diffs. This filter > should get written and debugged once and then made part of something > like git so that it doesn't get written over and over again. Perl is > way better for this I had 1000 lines of C in my program and it was > still missing 10% of the cases. Maybe I am missing something, but apart from $Log$, what's so hard about collapsing the CVS keywords? That's something like s/\$(Id|Date|Header|CVSHeader|Author|Revision):[^$]*\$/$\1$/ or not? I see that people want to modify the patch text to remove the churn, but that seems wrong. It would be much easier to just collapse the keywords before applying the patch, and then work on a newly created diff (from the git repo). cheers simon ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion 2007-07-23 14:44 ` Jon Smirl 2007-07-23 15:11 ` Simon 'corecode' Schubert @ 2007-07-23 20:11 ` Johannes Schindelin 2007-07-24 0:43 ` Jon Smirl 1 sibling, 1 reply; 24+ messages in thread From: Johannes Schindelin @ 2007-07-23 20:11 UTC (permalink / raw) To: Jon Smirl; +Cc: Git Mailing List Hi, On Mon, 23 Jul 2007, Jon Smirl wrote: > On 7/22/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > > Okay, I did not really test thoroughly, it seems. Sorry. Next try. > > It's working a lot better. People deal with $Id two different ways. > One way they delete all of the $Id lines, that is the case of the > Sonos patch. In the Phytec patch they left all of the $Id lines in > place which caused them to get modified. In both cases you just want > the lines with $Id to disappear in the patch. > > It doesn't catch the $Id case from the Phytec patch. > > diff -uarN linux-2.6.10/arch/cris/arch-v10/boot/rescue/head.S > linux-2.6.10-lpc3180/arch/cris/arch-v10/boot/rescue/head.S > --- linux-2.6.10/arch/cris/arch-v10/boot/rescue/head.S 2004-12-25 > 05:35:24.000000000 +0800 > +++ linux-2.6.10-lpc3180/arch/cris/arch-v10/boot/rescue/head.S > 2006-11-20 15:49:30.000000000 +0800 > @@ -1,4 +1,5 @@ > /* $Id: head.S,v 1.6 2003/04/09 08:12:43 pkj Exp $ > +/* $Id: head.S,v 1.2 2005/02/18 13:06:31 mike Exp $ > * > * Rescue code, made to reside at the beginning of the > * flash-memory. when it starts, it checks a partition > > It's not catching all of the $Revision and $Date deltas. > > The output diff shouldn't contain any CVS keywords. It is somewhat > tricky to catch all of the cases and fix up the diffs. This filter > should get written and debugged once and then made part of something > like git so that it doesn't get written over and over again. Perl is > way better for this I had 1000 lines of C in my program and it was > still missing 10% of the cases. Next try. It includes documentation, skips $Source:$ and $Header:$ as well, and the output diff does not contain any CVS keywords. Ah yes, and if a file was actually created, it substitutes the --- file name with /dev/null. In that case, the keywords are skipped, too! However, when a file was deleted, they are _not_ skipped. Strictly speaking, it is wrong to just cut out the lines containing CVS keywords, since it is possible that there is an important change on that line, but we can cross that bridge once companies play such cute games with us. But at least I tested the resulting patch with your phytec3180 patch this time... Ciao, Dscho -- snipsnap -- #!/usr/bin/perl # This is a simple state machine. # # There is the state of the current file; its header is stored # in $current_file to avoid outputting it when all hunks were # culled. It is only printed before the first hunk, and then # set to "" to avoid outputting it twice. # # There are the states of the current hunk, stored in # * $current_hunk (possibly modified hunk) # * $start_minus, $start_plus (from the original) # * $plus, $minus, $space (the current count of the respective lines) # a hunk is only printed (in flush_hunk) if any '+' or '-' lines # are left after filtering. # # For $Log..$, there is the state $skip_logs, which is set to 1 # after seeing such a line, and set to 0 when the first line # was seen which does not begin with '+'. # # A particularly nasty special case is when a single "*" was # misattributed by the diff to be _inserted before_ a $Log, instead # of _appended after_ a $Log. # This is the purpose of the $before_log and $after_log variables: # if not empty, the state machine expects the next line to begin # with '+' or '-', respectively, and followed by a $Log. If this # expectation is not met, the variable is output. # # The variable $plus_minus_adjust contains the number of lines which # were skipped from the "+" side, so that the correct offset is shown. # This function gets a hunk header. # # It initializes the state variables described above sub init_hunk { $_ = $_[0]; $current_hunk = ""; $current_hunk_header = $_; ($start_minus, $dummy, $start_plus, $dummy) = /^\@\@ -(\d+)(,\d+|) \+(\d+)(,\d+|) \@\@/; $plus = $minus = $space = 0; $skip_logs = 0; $before_log = ''; $after_log = ''; # we prefer /dev/null as original file name when a file is new if ($start_minus eq 0) { $current_file =~ s/\n--- .*\n/\n--- \/dev\/null\n/; } elsif ($start_plus eq 0) { $current_file =~ s/\n\+\+\+ .*\n/\n+++ \/dev\/null\n/; } } # This function is called whenever there is possibly a hunk to print. # Nothing is printed if no '+' or '-' lines are left. # Otherwise, if the file header was not yet shown, it does so now. sub flush_hunk { if ($plus > 0 || $minus > 0) { if ($current_file ne "") { print $current_file; $current_file = ""; } $minus += $space; $plus += $space; print "\@\@ -$start_minus,$minus " . "+" . ($start_plus - $start_plus_adjust) . ",$plus \@\@\n"; print $current_hunk; } } # This adds a line to the current hunk and updates $space, $plus or $minus sub add_line { my $line = $_[0]; $current_hunk .= $line; if ($line =~ /^ /) { $space++; } elsif ($line =~ /^\+/) { $plus++; } elsif ($line =~ /^-/) { $minus++; } elsif ($line =~ /^\\/) { # do nothing } else { die "Unexpected line: $line"; } } # This function splits the current hunk into the part before the current # line, and the part after the current line. sub skip_line { my $line = $_[0]; if ($start_minus == 0) { # This patch adds a new file, just ignore that line return; } elsif ($start_plus == 0) { # This patch removes a file, so include the line nevertheless add_line $_; return; } flush_hunk; if ($line =~ /^-/) { $minus++; } elsif ($line =~ /^\+/) { $plus++; $start_plus_adjust++; } init_hunk "@@ -" . ($start_minus + $minus + $space) . " +" . ($start_plus + $plus + $space) . " @@\n"; } $simple_keyword = "Id|Revision|Author|Date|Source|Header"; # This is the main loop sub check_file { $_ = $_[0]; $current_file = $_; $start_plus_adjust = 0; while (<>) { if (/^\@\@/) { last; } $current_file .= $_; } init_hunk $_; # check hunks while (<>) { if ($before_log) { if (!/\+.*\$Log.*\$/) { add_line $before_log; } else { skip_line $before_log; } $before_log = ''; } if ($after_log) { if (!/-.*\$Log.*\$/) { add_line $after_log; } else { skip_line $after_log; } $after_log = ''; } if ($skip_logs) { if (/^\+/) { skip_line $_; $skip_logs = 1; } else { $skip_logs = 0; if (/^ *\*$/) { $after_log = $_; } } } elsif (/^\+.*\$($simple_keyword).*\$/) { skip_line $_; } elsif (/^\@\@.*/) { flush_hunk; init_hunk $_; } elsif (/^diff/) { flush_hunk; return; } elsif (/^-.*\$($simple_keyword).*\$/) { # fake new hunk skip_line $_; } elsif (/^\+ *\*$/) { $before_log = $_; } elsif (/^([- \+]).*\$Log.*\$/) { skip_line $_; $skip_logs = 1; } else { add_line $_; } } } # This loop just shows everything before the first diff, and then hands # over to check_file whenever it sees a line beginning with "diff". while (<>) { if (/^diff/) { do { check_file $_; } while(/^diff/); flush_hunk; } else { printf $_; } } ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion 2007-07-23 20:11 ` Johannes Schindelin @ 2007-07-24 0:43 ` Jon Smirl 2007-07-24 1:06 ` Jon Smirl 2007-07-24 1:16 ` Johannes Schindelin 0 siblings, 2 replies; 24+ messages in thread From: Jon Smirl @ 2007-07-24 0:43 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Git Mailing List Thanks for working on this. I'd like to see it added to git toolkit. This diff is way easier to read now. I can see that it has SPI support backported from some future version. But... it still has some problems. For the phytec patch it's not getting the $Log changes in the qlogic files right. I'm checking the output diff line by line for problems. It's down to 11,528 lines from 88,787. That's a lot of junk removed, I'll have to make sure it isn't removing too much. -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion 2007-07-24 0:43 ` Jon Smirl @ 2007-07-24 1:06 ` Jon Smirl 2007-07-24 1:16 ` Johannes Schindelin 1 sibling, 0 replies; 24+ messages in thread From: Jon Smirl @ 2007-07-24 1:06 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Git Mailing List On 7/23/07, Jon Smirl <jonsmirl@gmail.com> wrote: > Thanks for working on this. I'd like to see it added to git toolkit. > This diff is way easier to read now. I can see that it has SPI support > backported from some future version. > > But... it still has some problems. > For the phytec patch it's not getting the $Log changes in the qlogic > files right. > > I'm checking the output diff line by line for problems. It's down to > 11,528 lines from 88,787. That's a lot of junk removed, I'll have to > make sure it isn't removing too much. I forgot about newly added files, looks like only 20,000 lines of CVS noise. The phytec patch has several ia64 files added to it, those are obvious not part of the support and an ARM9 processor. Never noticed those before. > > -- > Jon Smirl > jonsmirl@gmail.com > -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion 2007-07-24 0:43 ` Jon Smirl 2007-07-24 1:06 ` Jon Smirl @ 2007-07-24 1:16 ` Johannes Schindelin 2007-07-24 10:16 ` Jakub Narebski 1 sibling, 1 reply; 24+ messages in thread From: Johannes Schindelin @ 2007-07-24 1:16 UTC (permalink / raw) To: Jon Smirl; +Cc: Git Mailing List Hi, On Mon, 23 Jul 2007, Jon Smirl wrote: > Thanks for working on this. I'd like to see it added to git toolkit. I plan to submit it to patchutils instead, since this is not really dependent on git. > This diff is way easier to read now. I can see that it has SPI support > backported from some future version. > > But... it still has some problems. For the phytec patch it's not getting > the $Log changes in the qlogic files right. You are correct. This slipped by my eyes. > I'm checking the output diff line by line for problems. It's down to > 11,528 lines from 88,787. That's a lot of junk removed, I'll have to > make sure it isn't removing too much. May I ask you to test this version? Thanks, Dscho -- snipsnap -- #!/usr/bin/perl # This is a simple state machine. # # There is the state of the current file; its header is stored # in $current_file to avoid outputting it when all hunks were # culled. It is only printed before the first hunk, and then # set to "" to avoid outputting it twice. # # There are the states of the current hunk, stored in # * $current_hunk (possibly modified hunk) # * $start_minus, $start_plus (from the original) # * $plus, $minus, $space (the current count of the respective lines) # a hunk is only printed (in flush_hunk) if any '+' or '-' lines # are left after filtering. # # For $Log..$, there is the state $skip_logs, which is set to 1 # after seeing such a line, and set to 0 when the first line # was seen which does not begin with '+'. # # A particularly nasty special case is when a single "*" was # misattributed by the diff to be _inserted before_ a $Log, instead # of _appended after_ a $Log. # This is the purpose of the $before_log and $after_log variables: # if not empty, the state machine expects the next line to begin # with '+' or '-', respectively, and followed by a $Log. If this # expectation is not met, the variable is output. # # The variable $plus_minus_adjust contains the number of lines which # were skipped from the "+" side, so that the correct offset is shown. # This function gets a hunk header. # # It initializes the state variables described above sub init_hunk { my $line = $_[0]; $current_hunk = ""; ($start_minus, $dummy, $start_plus, $dummy) = ($line =~ /^\@\@ -(\d+)(,\d+|) \+(\d+)(,\d+|) \@\@/); $plus = $minus = $space = 0; $skip_logs = 0; $before_log = ''; $after_log = ''; # we prefer /dev/null as original file name when a file is new if ($start_minus eq 0) { $current_file =~ s/\n--- .*\n/\n--- \/dev\/null\n/; } elsif ($start_plus eq 0) { $current_file =~ s/\n\+\+\+ .*\n/\n+++ \/dev\/null\n/; } } # This function is called whenever there is possibly a hunk to print. # Nothing is printed if no '+' or '-' lines are left. # Otherwise, if the file header was not yet shown, it does so now. sub flush_hunk { if (($plus > 0 || $minus > 0) && $current_hunk ne '') { if ($current_file ne "") { print $current_file; $current_file = ""; } $minus += $space; $plus += $space; print "\@\@ -$start_minus,$minus " . "+" . ($start_plus - $start_plus_adjust) . ",$plus \@\@\n"; print $current_hunk; $current_hunk = ''; } } # This adds a line to the current hunk and updates $space, $plus or $minus sub add_line { my $line = $_[0]; $current_hunk .= $line; if ($line =~ /^ /) { $space++; } elsif ($line =~ /^\+/) { $plus++; } elsif ($line =~ /^-/) { $minus++; } elsif ($line =~ /^\\/) { # do nothing } else { die "Unexpected line: $line"; } } # This function splits the current hunk into the part before the current # line, and the part after the current line. sub skip_line { my $line = $_[0]; if ($start_minus == 0) { # This patch adds a new file, just ignore that line return; } elsif ($start_plus == 0) { # This patch removes a file, so include the line nevertheless add_line $_; return; } flush_hunk; if ($line =~ /^-/) { $minus++; } elsif ($line =~ /^\+/) { $plus++; $start_plus_adjust++; } init_hunk "@@ -" . ($start_minus + $minus + $space) . " +" . ($start_plus + $plus + $space) . " @@\n"; } $simple_keyword = "Id|Revision|Author|Date|Source|Header"; # This is the main loop sub check_file { $_ = $_[0]; $current_file = $_; $start_plus_adjust = 0; while (<>) { if (/^\@\@/) { last; } $current_file .= $_; } init_hunk $_; # check hunks while (<>) { if ($before_log) { if (!/\+.*\$Log.*\$/) { add_line $before_log; } else { skip_line $before_log; } $before_log = ''; } if ($after_log) { if (!/-.*\$Log.*\$/) { add_line $after_log; } else { skip_line $after_log; } $after_log = ''; } if ($skip_logs) { if (/^\+/) { skip_line $_; $skip_logs = 1; } else { $skip_logs = 0; if (/^ *\*$/) { $after_log = $_; } } } elsif (/^\+.*\$($simple_keyword).*\$/) { skip_line $_; } elsif (/^\@\@.*/) { flush_hunk; init_hunk $_; } elsif (/^diff/) { flush_hunk; return; } elsif (/^-.*\$($simple_keyword).*\$/) { # fake new hunk skip_line $_; } elsif (/^\+ *\*$/) { $before_log = $_; } elsif (/^([- \+]).*\$Log.*\$/) { skip_line $_; $skip_logs = 1; } else { add_line $_; } } } # This loop just shows everything before the first diff, and then hands # over to check_file whenever it sees a line beginning with "diff". while (<>) { if (/^diff/) { do { check_file $_; } while(/^diff/); } else { printf $_; } } flush_hunk; ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion 2007-07-24 1:16 ` Johannes Schindelin @ 2007-07-24 10:16 ` Jakub Narebski 2007-07-24 11:26 ` Johannes Schindelin 0 siblings, 1 reply; 24+ messages in thread From: Jakub Narebski @ 2007-07-24 10:16 UTC (permalink / raw) To: git [Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>, git@vger.kernel.org] Johannes Schindelin wrote: > On Mon, 23 Jul 2007, Jon Smirl wrote: > >> Thanks for working on this. I'd like to see it added to git toolkit. > > I plan to submit it to patchutils instead, since this is not really > dependent on git. Could you also add it to contrib/ area, please? -- Jakub Narebski Warsaw, Poland ShadeHawk on #git ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion 2007-07-24 10:16 ` Jakub Narebski @ 2007-07-24 11:26 ` Johannes Schindelin 2007-07-24 13:08 ` Jon Smirl 0 siblings, 1 reply; 24+ messages in thread From: Johannes Schindelin @ 2007-07-24 11:26 UTC (permalink / raw) To: Jakub Narebski; +Cc: git, Jon Smirl Hi, On Tue, 24 Jul 2007, Jakub Narebski wrote: > [Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>, git@vger.kernel.org] Thanks. > Johannes Schindelin wrote: > > > On Mon, 23 Jul 2007, Jon Smirl wrote: > > > >> Thanks for working on this. I'd like to see it added to git toolkit. > > > > I plan to submit it to patchutils instead, since this is not really > > dependent on git. > > Could you also add it to contrib/ area, please? Sure. Once it is really all fleshed out. But I really think that patchutils is a better place. That way, the script also helps the poor souls stuck with Mercurial or Darcs :-) The really tedious part is the testing, and the verifying. Fortunately, Jon made up for my incapability in both areas, with an incredible patience. Ciao, Dscho ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion 2007-07-24 11:26 ` Johannes Schindelin @ 2007-07-24 13:08 ` Jon Smirl 0 siblings, 0 replies; 24+ messages in thread From: Jon Smirl @ 2007-07-24 13:08 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Jakub Narebski, git On 7/24/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > The really tedious part is the testing, and the verifying. Fortunately, > Jon made up for my incapability in both areas, with an incredible > patience. I haven't finished checking every last line yet, if anything is left it is small. We lost power here all evening last night. That's not supposed to happen in urban Boston. It was black over a mile from me in all directions. > > Ciao, > Dscho > > -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion 2007-07-22 18:48 Git help for kernel archeology, suppress diffs caused by CVS keyword expansion Jon Smirl 2007-07-22 19:00 ` Johannes Schindelin @ 2007-07-22 19:09 ` Linus Torvalds 2007-07-22 19:12 ` Jon Smirl 2007-07-22 19:17 ` David Kastrup 2007-07-22 19:47 ` Jan Engelhardt 2 siblings, 2 replies; 24+ messages in thread From: Linus Torvalds @ 2007-07-22 19:09 UTC (permalink / raw) To: Jon Smirl; +Cc: Git Mailing List, lkml On Sun, 22 Jul 2007, Jon Smirl wrote: > > It would really be useful if git diff had an option for suppressing > diffs caused by CVS keyword expansion. I really think it's not a "git diff" issue, but it might be a "import" issue. IOW, I think you'd be a *lot* better off just not importing those things in the first place (which is what CVS does internally), or possibly importing them as two trees (ie you'd have the "non-log" version and the "log expansion" version, so that you can track and compare both). Doing the thing at "diff" time is certainly possible, but this is simply much better done as a totally independent preprocessing phase. The diff handling is already some of the more complex parts (and very central), it would be much simpler and efficient to not try to make that thing fancier, and instead solve the problem at the front-end. Linus ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion 2007-07-22 19:09 ` Linus Torvalds @ 2007-07-22 19:12 ` Jon Smirl 2007-07-22 19:37 ` Linus Torvalds 2007-07-22 19:17 ` David Kastrup 1 sibling, 1 reply; 24+ messages in thread From: Jon Smirl @ 2007-07-22 19:12 UTC (permalink / raw) To: Linus Torvalds; +Cc: Git Mailing List, lkml On 7/22/07, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > > On Sun, 22 Jul 2007, Jon Smirl wrote: > > > > It would really be useful if git diff had an option for suppressing > > diffs caused by CVS keyword expansion. > > I really think it's not a "git diff" issue, but it might be a "import" > issue. > > IOW, I think you'd be a *lot* better off just not importing those things > in the first place (which is what CVS does internally), or possibly > importing them as two trees (ie you'd have the "non-log" version and the > "log expansion" version, so that you can track and compare both). > > Doing the thing at "diff" time is certainly possible, but this is simply > much better done as a totally independent preprocessing phase. The diff > handling is already some of the more complex parts (and very central), it > would be much simpler and efficient to not try to make that thing fancier, > and instead solve the problem at the front-end. These diffs are coming from companies doing GPL compliance without really wanting to comply. CVS servers are not made available. > > Linus > -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion 2007-07-22 19:12 ` Jon Smirl @ 2007-07-22 19:37 ` Linus Torvalds 0 siblings, 0 replies; 24+ messages in thread From: Linus Torvalds @ 2007-07-22 19:37 UTC (permalink / raw) To: Jon Smirl; +Cc: Git Mailing List, lkml On Sun, 22 Jul 2007, Jon Smirl wrote: > On 7/22/07, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > > > > > On Sun, 22 Jul 2007, Jon Smirl wrote: > > > > > > It would really be useful if git diff had an option for suppressing > > > diffs caused by CVS keyword expansion. > > > > I really think it's not a "git diff" issue, but it might be a "import" > > issue. > > > > IOW, I think you'd be a *lot* better off just not importing those things > > in the first place (which is what CVS does internally), or possibly > > importing them as two trees (ie you'd have the "non-log" version and the > > "log expansion" version, so that you can track and compare both). > > > > Doing the thing at "diff" time is certainly possible, but this is simply > > much better done as a totally independent preprocessing phase. The diff > > handling is already some of the more complex parts (and very central), it > > would be much simpler and efficient to not try to make that thing fancier, > > and instead solve the problem at the front-end. > > These diffs are coming from companies doing GPL compliance without > really wanting to comply. CVS servers are not made available. That wasn't what I said. You want to supporess the CVS keyword expansion. I'm telling you that you should just do so. But I'm *also* telling you that this has nothing to do with "git diff". The way to get "git diff" to not show the CVS expansion is to *remove* the CVS expansion. By simply running some pre-processing on the patches *before* you put them into git in the first place (or, like Dscho suggested: you could do it later too, by using git filter-branch). Once you have the version that doesn't have the CVS log, "git diff" will automatically do the rigth thing. In other words, I'm just saying that you're trying to solve the wrong problem. Once you solve the *right* problem, the wrong problem just goes away. Linus ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion 2007-07-22 19:09 ` Linus Torvalds 2007-07-22 19:12 ` Jon Smirl @ 2007-07-22 19:17 ` David Kastrup 1 sibling, 0 replies; 24+ messages in thread From: David Kastrup @ 2007-07-22 19:17 UTC (permalink / raw) To: Linus Torvalds; +Cc: Jon Smirl, Git Mailing List, lkml Linus Torvalds <torvalds@linux-foundation.org> writes: > On Sun, 22 Jul 2007, Jon Smirl wrote: >> >> It would really be useful if git diff had an option for suppressing >> diffs caused by CVS keyword expansion. > > I really think it's not a "git diff" issue, but it might be a "import" > issue. > > IOW, I think you'd be a *lot* better off just not importing those > things in the first place (which is what CVS does internally), or > possibly importing them as two trees (ie you'd have the "non-log" > version and the "log expansion" version, so that you can track and > compare both). One problem is that those strings more often than not are involved in some magic computation, like placing version and date information into extracted output. While an import of the unexpanded version is really the sanest option, it might render the resulting code inoperable. CVS diff itself reports those differences, too (and it makes for a good quota of merge problems even in CVS), so it is not like we are in bad company here. Uh, strike that. -- David Kastrup, Kriemhildstr. 15, 44793 Bochum ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: Git help for kernel archeology, suppress diffs caused by CVS keyword expansion 2007-07-22 18:48 Git help for kernel archeology, suppress diffs caused by CVS keyword expansion Jon Smirl 2007-07-22 19:00 ` Johannes Schindelin 2007-07-22 19:09 ` Linus Torvalds @ 2007-07-22 19:47 ` Jan Engelhardt 2 siblings, 0 replies; 24+ messages in thread From: Jan Engelhardt @ 2007-07-22 19:47 UTC (permalink / raw) To: Jon Smirl; +Cc: Git Mailing List, lkml On Jul 22 2007 14:48, Jon Smirl wrote: > > It would really be useful if git diff had an option for suppressing > diffs caused by CVS keyword expansion. I run into this problem over > and over when trying to recover stuff out of old kernel sources that > people checked into CVS and then posted CVS diffs to fulfill their GPL > obligations. I sometimes wonder if vendors are doing this on purpose > to make it more difficult to recover the changes they made to the > code. > > To prevent this in the future, I'd love to see a patch removing all > CVS keywords from the kernel sources. I say: yes please. Even svn (which Linus dislikes very much) has cvs keywords disabled by default. And at the same time, a patch to CodingStyle not to include CVS tags, and a patch to checkpatch.pl to enforce it :) > Quick grep of kernel shows 1,535 $Id, 197 $Log, 441 $Revision, 144 $Date. I > guestimate that this would remove about 5,000 lines form the kernel source > and touch 1,700 files. Would this be accept or do those expansions contain > useful info? Common argument for $Id$ tags has been, that upon a problem, a user can look into dmesg (or whatever log is appropraite - it's not limited to the kernel) for the version string instead of figuring it out himself (by means or rpm -q, or whatever hides it). For the Linux kernel however, I do not think it is important, since: - the code of a module changes faster than you can think (e.g. a change in the network code touched a network filesystem). Result: same $Id$ tag in the vendor's version AND the mainline version. So the tag is useless. - you know what version you are running, mingetty usually prints it for you; uname -a also shows it. - if you have a problem, people kindly ask you to bisect anyway, in which case $Id$ numbers become irrelevant anyway, because you got the git SHA1 ids. Yep, that's pretty much it from my side. Conclusion: Remove CVS tags from kernel. Jan -- ^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2007-07-24 13:08 UTC | newest] Thread overview: 24+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2007-07-22 18:48 Git help for kernel archeology, suppress diffs caused by CVS keyword expansion Jon Smirl 2007-07-22 19:00 ` Johannes Schindelin 2007-07-22 19:10 ` Jon Smirl 2007-07-22 19:15 ` Johannes Schindelin 2007-07-22 19:48 ` Jon Smirl 2007-07-22 21:39 ` Johannes Schindelin 2007-07-22 23:45 ` Jon Smirl 2007-07-22 23:50 ` Johannes Schindelin 2007-07-23 0:11 ` Jon Smirl 2007-07-23 0:37 ` Johannes Schindelin 2007-07-23 14:44 ` Jon Smirl 2007-07-23 15:11 ` Simon 'corecode' Schubert 2007-07-23 20:11 ` Johannes Schindelin 2007-07-24 0:43 ` Jon Smirl 2007-07-24 1:06 ` Jon Smirl 2007-07-24 1:16 ` Johannes Schindelin 2007-07-24 10:16 ` Jakub Narebski 2007-07-24 11:26 ` Johannes Schindelin 2007-07-24 13:08 ` Jon Smirl 2007-07-22 19:09 ` Linus Torvalds 2007-07-22 19:12 ` Jon Smirl 2007-07-22 19:37 ` Linus Torvalds 2007-07-22 19:17 ` David Kastrup 2007-07-22 19:47 ` Jan Engelhardt
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).