git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] git-cvsserver: Add support for some binary files
@ 2008-05-15  4:35 Matthew Ogilvie
  2008-05-15  4:35 ` [PATCH 1/3] git-cvsserver: add mechanism for managing working tree and current directory Matthew Ogilvie
  2008-05-17  0:03 ` [PATCH 0/3] git-cvsserver: Add support for some binary files Junio C Hamano
  0 siblings, 2 replies; 11+ messages in thread
From: Matthew Ogilvie @ 2008-05-15  4:35 UTC (permalink / raw)
  To: git; +Cc: Matthew Ogilvie

This series of patches extends git-cvsserver to support telling the
CVS client to set the -kb (binary) mode for files that git considers
to be binary (and not for text files).  It includes updates to
documentation and tests.

By default the new binary support is not enabled.  To enable it,
you should set "gitcvs.usecrlfattr" and "gitcvs.allbinary=guess",
as described in the updated documentation.

-----------------

This patch series is usable now, but there are some things I'm not
sure about, and things that could still use improvement:

1. As currently implemented, the second patch (for checking file
attributes) forks a separate instance of git-check-attr for every
file it needs to look up in the repository.  Each invocation involves
reading the index file, so things may get kind of slow if
there are a whole lot of files in the repository.  It might be
worth reorganizing things so that it can ask about multiple
files in one invocation of git-check-attr, but such a change would
probably be invasive enough to warrant a separate patch.

2. Is there a better/more intuitive way of configuring this?  Perhaps
"gitcvs.autocrlf" that is similar to "core.autocrlf"?  But it seems
unfriendly to drop default and "gitcvs.allbinary" modes; some
users may have set things up such that those modes are needed.

3. I'm not sure about the best way to handle repeatably changing
current directory.  The first patch tries to make a somewhat general
mechanism to manage it, but I keep thinking in the back of my mind
that it might be better to set up a working directory first thing,
and then minimize any further directory changes after that.  Does
anyone have any thoughts about this?

4. Possibly additional enhancements including:
a. Strip out '\r' from "text" files, so when the CVS client
adds '\r', you don't wind up with double '\r's per line.
b. Additional conversions like in convert.c, done on server side.
Including safecrlf, smudge/clean filters, etc.
c. If a new .gitattributes file is sent by the client, use it
in preference over the one from the most recent commit.  As it
is now, a user might need to commit the new .gitattributes before
committing anything else.  This might be much easier if a new
overall design for setting up and using a working directory was
used (see above).

5. It might make things clearer to refactor the special case
transmitfile() modes to be implemented as separate functions that
use open_blob_or_die().  Probably a separate patch, if done at all.

6. Additional tweaks to the documentation?  For example, should
there be a note on "core.autocrlf" that binary support in emulation
tools may use other configuration variables...

Matthew Ogilvie (3):
      git-cvsserver: add mechanism for managing working tree and current directory
      implement gitcvs.usecrlfattr
      git-cvsserver: add ability to guess -kb from contents

 Documentation/config.txt        |   26 ++-
 Documentation/git-cvsserver.txt |   32 ++-
 git-cvsserver.perl              |  500 ++++++++++++++++++++++++++++++++++-----
 t/t9401-git-cvsserver-crlf.sh   |  337 ++++++++++++++++++++++++++
 4 files changed, 826 insertions(+), 69 deletions(-)
 create mode 100755 t/t9401-git-cvsserver-crlf.sh

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/3] git-cvsserver: add mechanism for managing working tree and current directory
  2008-05-15  4:35 [PATCH 0/3] git-cvsserver: Add support for some binary files Matthew Ogilvie
@ 2008-05-15  4:35 ` Matthew Ogilvie
  2008-05-15  4:35   ` [PATCH 2/3] implement gitcvs.usecrlfattr Matthew Ogilvie
  2008-05-17  0:03 ` [PATCH 0/3] git-cvsserver: Add support for some binary files Junio C Hamano
  1 sibling, 1 reply; 11+ messages in thread
From: Matthew Ogilvie @ 2008-05-15  4:35 UTC (permalink / raw)
  To: git; +Cc: Matthew Ogilvie

There are various reasons git-cvsserver needs to manipulate the current
directory, and this patch attempts to clarify and validate such changes:

1. Temporary empty working directory (with index) for certain operations
   that require an index file to work.
2. Use a temporary directory with temporary file names for doing
   merges of user's dirty sandbox state with latest changes in
   repository.
3. Coming up soon: Set up an index and either a valid or empty
   working directory when calling git-check-attr to decide
   if a file should be marked binary (-kb).

Signed-off-by: Matthew Ogilvie <mmogilvi_git@miniinfo.net>
---

I'm not sure about this.  I get the vague sense it might be better to
just always set up a (usually empty except for the index file and
user's changed files) working directory early in processing, and
minimize chdir calls after that.  It might also make it
possible to have a new .gitattributes file take effect immediately.
But that would be a much more invasive change.

 git-cvsserver.perl |  252 ++++++++++++++++++++++++++++++++++++++++++++--------
 1 files changed, 213 insertions(+), 39 deletions(-)

diff --git a/git-cvsserver.perl b/git-cvsserver.perl
index 29dbfc9..674892b 100755
--- a/git-cvsserver.perl
+++ b/git-cvsserver.perl
@@ -21,6 +21,7 @@ use bytes;
 
 use Fcntl;
 use File::Temp qw/tempdir tempfile/;
+use File::Path qw/rmtree/;
 use File::Basename;
 use Getopt::Long qw(:config require_order no_ignore_case);
 
@@ -86,6 +87,17 @@ my $methods = {
 # $state holds all the bits of information the clients sends us that could
 # potentially be useful when it comes to actually _doing_ something.
 my $state = { prependdir => '' };
+
+# Work is for managing temporary working directory
+my $work =
+    {
+        state => undef,  # undef, 1 (empty), 2 (with stuff)
+        workDir => undef,
+        index => undef,
+        emptyDir => undef,
+        tmpDir => undef
+    };
+
 $log->info("--------------- STARTING -----------------");
 
 my $usage =
@@ -189,6 +201,9 @@ while (<STDIN>)
 $log->debug("Processing time : user=" . (times)[0] . " system=" . (times)[1]);
 $log->info("--------------- FINISH -----------------");
 
+chdir '/';
+exit 0;
+
 # Magic catchall method.
 #    This is the method that will handle all commands we haven't yet
 #    implemented. It simply sends a warning to the log file indicating a
@@ -1101,10 +1116,10 @@ sub req_update
             $log->info("Updating '$filename'");
             my ( $filepart, $dirpart ) = filenamesplit($meta->{name},1);
 
-            my $dir = tempdir( DIR => $TEMP_DIR, CLEANUP => 1 ) . "/";
+            my $mergeDir = setupTmpDir();
 
-            chdir $dir;
             my $file_local = $filepart . ".mine";
+            my $mergedFile = "$mergeDir/$file_local";
             system("ln","-s",$state->{entries}{$filename}{modified_filename}, $file_local);
             my $file_old = $filepart . "." . $oldmeta->{revision};
             transmitfile($oldmeta->{filehash}, { targetfile => $file_old });
@@ -1115,11 +1130,13 @@ sub req_update
             $log->info("Merging $file_local, $file_old, $file_new");
             print "M Merging differences between 1.$oldmeta->{revision} and 1.$meta->{revision} into $filename\n";
 
-            $log->debug("Temporary directory for merge is $dir");
+            $log->debug("Temporary directory for merge is $mergeDir");
 
             my $return = system("git", "merge-file", $file_local, $file_old, $file_new);
             $return >>= 8;
 
+            cleanupTmpDir();
+
             if ( $return == 0 )
             {
                 $log->info("Merged successfully");
@@ -1168,13 +1185,11 @@ sub req_update
                 # transmit file, format is single integer on a line by itself (file
                 # size) followed by the file contents
                 # TODO : we should copy files in blocks
-                my $data = `cat $file_local`;
+                my $data = `cat $mergedFile`;
                 $log->debug("File size : " . length($data));
                 print length($data) . "\n";
                 print $data;
             }
-
-            chdir "/";
         }
 
     }
@@ -1195,6 +1210,7 @@ sub req_ci
     if ( $state->{method} eq 'pserver')
     {
         print "error 1 pserver access cannot commit\n";
+        cleanupWorkTree();
         exit;
     }
 
@@ -1202,6 +1218,7 @@ sub req_ci
     {
         $log->warn("file 'index' already exists in the git repository");
         print "error 1 Index already exists in git repo\n";
+        cleanupWorkTree();
         exit;
     }
 
@@ -1209,31 +1226,20 @@ sub req_ci
     my $updater = GITCVS::updater->new($state->{CVSROOT}, $state->{module}, $log);
     $updater->update();
 
-    my $tmpdir = tempdir ( DIR => $TEMP_DIR );
-    my ( undef, $file_index ) = tempfile ( DIR => $TEMP_DIR, OPEN => 0 );
-    $log->info("Lockless commit start, basing commit on '$tmpdir', index file is '$file_index'");
-
-    $ENV{GIT_DIR} = $state->{CVSROOT} . "/";
-    $ENV{GIT_WORK_TREE} = ".";
-    $ENV{GIT_INDEX_FILE} = $file_index;
-
     # Remember where the head was at the beginning.
     my $parenthash = `git show-ref -s refs/heads/$state->{module}`;
     chomp $parenthash;
     if ($parenthash !~ /^[0-9a-f]{40}$/) {
 	    print "error 1 pserver cannot find the current HEAD of module";
+	    cleanupWorkTree();
 	    exit;
     }
 
-    chdir $tmpdir;
+    setupWorkTree($parenthash);
 
-    # populate the temporary index
-    system("git-read-tree", $parenthash);
-    unless ($? == 0)
-    {
-	die "Error running git-read-tree $state->{module} $file_index $!";
-    }
-    $log->info("Created index '$file_index' for head $state->{module} - exit status $?");
+    $log->info("Lockless commit start, basing commit on '$work->{workDir}', index file is '$work->{index}'");
+
+    $log->info("Created index '$work->{index}' for head $state->{module} - exit status $?");
 
     my @committedfiles = ();
     my %oldmeta;
@@ -1271,7 +1277,7 @@ sub req_ci
         {
             # fail everything if an up to date check fails
             print "error 1 Up to date check failed for $filename\n";
-            chdir "/";
+            cleanupWorkTree();
             exit;
         }
 
@@ -1313,7 +1319,7 @@ sub req_ci
     {
         print "E No files to commit\n";
         print "ok\n";
-        chdir "/";
+        cleanupWorkTree();
         return;
     }
 
@@ -1336,7 +1342,7 @@ sub req_ci
     {
         $log->warn("Commit failed (Invalid commit hash)");
         print "error 1 Commit failed (unknown reason)\n";
-        chdir "/";
+        cleanupWorkTree();
         exit;
     }
 
@@ -1348,7 +1354,7 @@ sub req_ci
 		{
 			$log->warn("Commit failed (update hook declined to update ref)");
 			print "error 1 Commit failed (update hook declined)\n";
-			chdir "/";
+			cleanupWorkTree();
 			exit;
 		}
 	}
@@ -1358,6 +1364,7 @@ sub req_ci
 			"refs/heads/$state->{module}", $commithash, $parenthash)) {
 		$log->warn("update-ref for $state->{module} failed.");
 		print "error 1 Cannot commit -- update first\n";
+		cleanupWorkTree();
 		exit;
 	}
 
@@ -1414,7 +1421,7 @@ sub req_ci
         }
     }
 
-    chdir "/";
+    cleanupWorkTree();
     print "ok\n";
 }
 
@@ -1757,15 +1764,9 @@ sub req_annotate
     argsfromdir($updater);
 
     # we'll need a temporary checkout dir
-    my $tmpdir = tempdir ( DIR => $TEMP_DIR );
-    my ( undef, $file_index ) = tempfile ( DIR => $TEMP_DIR, OPEN => 0 );
-    $log->info("Temp checkoutdir creation successful, basing annotate session work on '$tmpdir', index file is '$file_index'");
-
-    $ENV{GIT_DIR} = $state->{CVSROOT} . "/";
-    $ENV{GIT_WORK_TREE} = ".";
-    $ENV{GIT_INDEX_FILE} = $file_index;
+    setupWorkTree();
 
-    chdir $tmpdir;
+    $log->info("Temp checkoutdir creation successful, basing annotate session work on '$work->{workDir}', index file is '$ENV{GIT_INDEX_FILE}'");
 
     # foreach file specified on the command line ...
     foreach my $filename ( @{$state->{args}} )
@@ -1789,10 +1790,10 @@ sub req_annotate
 	system("git-read-tree", $lastseenin);
 	unless ($? == 0)
 	{
-	    print "E error running git-read-tree $lastseenin $file_index $!\n";
+	    print "E error running git-read-tree $lastseenin $ENV{GIT_INDEX_FILE} $!\n";
 	    return;
 	}
-	$log->info("Created index '$file_index' with commit $lastseenin - exit status $?");
+	$log->info("Created index '$ENV{GIT_INDEX_FILE}' with commit $lastseenin - exit status $?");
 
         # do a checkout of the file
         system('git-checkout-index', '-f', '-u', $filename);
@@ -1808,7 +1809,7 @@ sub req_annotate
         # git-jsannotate telling us about commits we are hiding
         # from the client.
 
-        my $a_hints = "$tmpdir/.annotate_hints";
+        my $a_hints = "$work->{workDir}/.annotate_hints";
         if (!open(ANNOTATEHINTS, '>', $a_hints)) {
             print "E failed to open '$a_hints' for writing: $!\n";
             return;
@@ -1862,7 +1863,7 @@ sub req_annotate
     }
 
     # done; get out of the tempdir
-    chdir "/";
+    cleanupWorkDir();
 
     print "ok\n";
 
@@ -2115,6 +2116,179 @@ sub filecleanup
     return $filename;
 }
 
+sub validateGitDir
+{
+    if( !defined($state->{CVSROOT}) )
+    {
+        print "error 1 CVSROOT not specified\n";
+        cleanupWorkTree();
+        exit;
+    }
+    if( $ENV{GIT_DIR} ne ($state->{CVSROOT} . '/') )
+    {
+        print "error 1 Internally inconsistent CVSROOT\n";
+        cleanupWorkTree();
+        exit;
+    }
+}
+
+# Setup working directory in a work tree with the requested version
+# loaded in the index.
+sub setupWorkTree
+{
+    my ($ver) = @_;
+
+    validateGitDir();
+
+    if( ( defined($work->{state}) && $work->{state} != 1 ) ||
+        defined($work->{tmpDir}) )
+    {
+        $log->warn("Bad work tree state management");
+        print "error 1 Internal setup multiple work trees without cleanup\n";
+        cleanupWorkTree();
+        exit;
+    }
+
+    $work->{workDir} = tempdir ( DIR => $TEMP_DIR );
+
+    if( !defined($work->{index}) )
+    {
+        (undef, $work->{index}) = tempfile ( DIR => $TEMP_DIR, OPEN => 0 );
+    }
+
+    chdir $work->{workDir} or
+        die "Unable to chdir to $work->{workDir}\n";
+
+    $log->info("Setting up GIT_WORK_TREE as '.' in '$work->{workDir}', index file is '$work->{index}'");
+
+    $ENV{GIT_WORK_TREE} = ".";
+    $ENV{GIT_INDEX_FILE} = $work->{index};
+    $work->{state} = 2;
+
+    if($ver)
+    {
+        system("git","read-tree",$ver);
+        unless ($? == 0)
+        {
+            $log->warn("Error running git-read-tree");
+            die "Error running git-read-tree $ver in $work->{workDir} $!\n";
+        }
+    }
+    # else # req_annotate reads tree for each file
+}
+
+# Ensure current directory is in some kind of working directory,
+# with a recent version loaded in the index.
+sub ensureWorkTree
+{
+    if( defined($work->{tmpDir}) )
+    {
+        $log->warn("Bad work tree state management [ensureWorkTree()]");
+        print "error 1 Internal setup multiple dirs without cleanup\n";
+        cleanupWorkTree();
+        exit;
+    }
+    if( $work->{state} )
+    {
+        return;
+    }
+
+    validateGitDir();
+
+    if( !defined($work->{emptyDir}) )
+    {
+        $work->{emptyDir} = tempdir ( DIR => $TEMP_DIR, OPEN => 0);
+    }
+    chdir $work->{emptyDir} or
+        die "Unable to chdir to $work->{emptyDir}\n";
+
+    my $ver = `git show-ref -s refs/heads/$state->{module}`;
+    chomp $ver;
+    if ($ver !~ /^[0-9a-f]{40}$/)
+    {
+        $log->warn("Error from git show-ref -s refs/head$state->{module}");
+        print "error 1 cannot find the current HEAD of module";
+        cleanupWorkTree();
+        exit;
+    }
+
+    if( !defined($work->{index}) )
+    {
+        (undef, $work->{index}) = tempfile ( DIR => $TEMP_DIR, OPEN => 0 );
+    }
+
+    $ENV{GIT_WORK_TREE} = ".";
+    $ENV{GIT_INDEX_FILE} = $work->{index};
+    $work->{state} = 1;
+
+    system("git","read-tree",$ver);
+    unless ($? == 0)
+    {
+        die "Error running git-read-tree $ver $!\n";
+    }
+}
+
+# Cleanup working directory that is not needed any longer.
+sub cleanupWorkTree
+{
+    if( ! $work->{state} )
+    {
+        return;
+    }
+
+    chdir "/" or die "Unable to chdir '/'\n";
+
+    if( defined($work->{workDir}) )
+    {
+        rmtree( $work->{workDir} );
+        undef $work->{workDir};
+    }
+    undef $work->{state};
+}
+
+# Setup a temporary directory (not a working tree), typically for
+# merging dirty state as in req_update.
+sub setupTmpDir
+{
+    $work->{tmpDir} = tempdir ( DIR => $TEMP_DIR );
+    chdir $work->{tmpDir} or die "Unable to chdir $work->{tmpDir}\n";
+
+    return $work->{tmpDir};
+}
+
+# Clean up a previously setupTmpDir.  Restore previous work tree if
+# appropriate.
+sub cleanupTmpDir
+{
+    if ( !defined($work->{tmpDir}) )
+    {
+        $log->warn("cleanup tmpdir that has not been setup");
+        die "Cleanup tmpDir that has not been setup\n";
+    }
+    if( defined($work->{state}) )
+    {
+        if( $work->{state} == 1 )
+        {
+            chdir $work->{emptyDir} or
+                die "Unable to chdir to $work->{emptyDir}\n";
+        }
+        elsif( $work->{state} == 2 )
+        {
+            chdir $work->{workDir} or
+                die "Unable to chdir to $work->{emptyDir}\n";
+        }
+        else
+        {
+            $log->warn("Inconsistent work dir state");
+            die "Inconsistent work dir state\n";
+        }
+    }
+    else
+    {
+        chdir "/" or die "Unable to chdir '/'\n";
+    }
+}
+
 # Given a path, this function returns a string containing the kopts
 # that should go into that path's Entries line.  For example, a binary
 # file should get -kb.
-- 
1.5.4.3.340.g97b97

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/3] implement gitcvs.usecrlfattr
  2008-05-15  4:35 ` [PATCH 1/3] git-cvsserver: add mechanism for managing working tree and current directory Matthew Ogilvie
@ 2008-05-15  4:35   ` Matthew Ogilvie
  2008-05-15  4:35     ` [PATCH 3/3] git-cvsserver: add ability to guess -kb from contents Matthew Ogilvie
  0 siblings, 1 reply; 11+ messages in thread
From: Matthew Ogilvie @ 2008-05-15  4:35 UTC (permalink / raw)
  To: git; +Cc: Matthew Ogilvie

If gitcvs.usecrlfattr is set to true, git-cvsserver will consult
the "crlf" for each file to determine if it should mark the file
as binary (-kb).

Signed-off-by: Matthew Ogilvie <mmogilvi_git@miniinfo.net>
---

There may be a performance issue with using a separate invocation of
git-check-attr for every file.  Perhaps an additional patch is needed
to reorganize things to check multiple files in one invocation.

 Documentation/config.txt        |   23 ++++--
 Documentation/git-cvsserver.txt |   26 +++++-
 git-cvsserver.perl              |   71 +++++++++++++---
 t/t9401-git-cvsserver-crlf.sh   |  178 +++++++++++++++++++++++++++++++++++++++
 4 files changed, 276 insertions(+), 22 deletions(-)
 create mode 100755 t/t9401-git-cvsserver-crlf.sh

diff --git a/Documentation/config.txt b/Documentation/config.txt
index a6fc5a2..820795f 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -645,11 +645,21 @@ gitcvs.logfile::
 	Path to a log file where the CVS server interface well... logs
 	various stuff. See linkgit:git-cvsserver[1].
 
+gitcvs.usecrlfattr
+	If true, the server will look up the `crlf` attribute for
+	files to determine the '-k' modes to use. If `crlf` is set,
+	the '-k' mode will be left blank, so cvs clients will
+	treat it as text. If `crlf` is explicitly unset, the file
+	will be set with '-kb' mode, which supresses any newline munging
+	the client might otherwise do. If `crlf` is not specified,
+	then 'gitcvs.allbinary' is used. See linkgit:gitattribute[5].
+
 gitcvs.allbinary::
-	If true, all files are sent to the client in mode '-kb'. This
-	causes the client to treat all files as binary files which suppresses
-	any newline munging it otherwise might do. A work-around for the
-	fact that there is no way yet to set single files to mode '-kb'.
+	If true, all files not otherwise specified using
+	'gitcvs.usecrlfattr' and an explicitly set or unset `crlf`
+	attribute are sent to the client in mode '-kb'. This
+	causes the client to treat them as binary files which
+	suppresses any newline munging it otherwise might do.
 
 gitcvs.dbname::
 	Database used by git-cvsserver to cache revision information
@@ -680,8 +690,9 @@ gitcvs.dbTableNamePrefix::
 	linkgit:git-cvsserver[1] for details).  Any non-alphabetic
 	characters will be replaced with underscores.
 
-All gitcvs variables except for 'gitcvs.allbinary' can also be
-specified as 'gitcvs.<access_method>.<varname>' (where 'access_method'
+All gitcvs variables except for 'gitcvs.usecrlfattr' and
+'gitcvs.allbinary' can also be specified as
+'gitcvs.<access_method>.<varname>' (where 'access_method'
 is one of "ext" and "pserver") to make them apply only for the given
 access method.
 
diff --git a/Documentation/git-cvsserver.txt b/Documentation/git-cvsserver.txt
index b110671..8393028 100644
--- a/Documentation/git-cvsserver.txt
+++ b/Documentation/git-cvsserver.txt
@@ -301,11 +301,27 @@ checkout, diff, status, update, log, add, remove, commit.
 Legacy monitoring operations are not supported (edit, watch and related).
 Exports and tagging (tags and branches) are not supported at this stage.
 
-The server should set the '-k' mode to binary when relevant, however,
-this is not really implemented yet. For now, you can force the server
-to set '-kb' for all files by setting the `gitcvs.allbinary` config
-variable. In proper GIT tradition, the contents of the files are
-always respected. No keyword expansion or newline munging is supported.
+CRLF Line Ending Conversions
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+By default the server leaves the '-k' mode blank for all files,
+which causes the cvs client to treat them as a text files, subject
+to crnl conversion on some platforms.
+
+You can make the server use `crnl` attributes to set the '-k' modes
+for files by setting the `gitcvs.usecrlfattr` config variable.
+In this case, if `crlf` is explicitly unset ('-crnl'), then the
+will set '-kb' mode, for binary files.  If it `crlf` is set,
+then the '-k' mode will explicitly be left blank.  See
+also linkgit:gitattributes[5] for more information about the `crlf`
+attribute.
+
+Alternatively, if `gitcvs.usecrlfattr` config is not enabled
+or if the `crlf` attribute is unspecified for a filename, then
+the server uses the `gitcvs.allbinary` for the default setting.
+If `gitcvs.allbinary` is set, then the files not otherwise
+specified will default to '-kb' mode. Otherwise the '-k' mode
+is left blank.
 
 Dependencies
 ------------
diff --git a/git-cvsserver.perl b/git-cvsserver.perl
index 674892b..58206ae 100755
--- a/git-cvsserver.perl
+++ b/git-cvsserver.perl
@@ -502,7 +502,7 @@ sub req_add
                 print $state->{CVSROOT} . "/$state->{module}/$filename\n";
 
                 # this is an "entries" line
-                my $kopts = kopts_from_path($filepart);
+                my $kopts = kopts_from_path($filename);
                 $log->debug("/$filepart/1.$meta->{revision}//$kopts/");
                 print "/$filepart/1.$meta->{revision}//$kopts/\n";
                 # permissions
@@ -533,9 +533,25 @@ sub req_add
 
         print "Checked-in $dirpart\n";
         print "$filename\n";
-        my $kopts = kopts_from_path($filepart);
+        my $kopts = kopts_from_path($filename);
         print "/$filepart/0//$kopts/\n";
 
+        my $requestedKopts = $state->{opt}{k};
+        if(defined($requestedKopts))
+        {
+            $requestedKopts = "-k$requestedKopts";
+        }
+        else
+        {
+            $requestedKopts = "";
+        }
+        if( $kopts ne $requestedKopts )
+        {
+            $log->warn("Ignoring requested -k='$requestedKopts'"
+                        . " for '$filename'; detected -k='$kopts' instead");
+            #TODO: Also have option to send warning to user?
+        }
+
         $addcount++;
     }
 
@@ -615,7 +631,7 @@ sub req_remove
 
         print "Checked-in $dirpart\n";
         print "$filename\n";
-        my $kopts = kopts_from_path($filepart);
+        my $kopts = kopts_from_path($filename);
         print "/$filepart/-1.$wrev//$kopts/\n";
 
         $rmcount++;
@@ -785,6 +801,7 @@ sub req_co
     argsplit("co");
 
     my $module = $state->{args}[0];
+    $state->{module} = $module;
     my $checkout_path = $module;
 
     # use the user specified directory if we're given it
@@ -862,6 +879,7 @@ sub req_co
         # Don't want to check out deleted files
         next if ( $git->{filehash} eq "deleted" );
 
+        my $fullName = $git->{name};
         ( $git->{name}, $git->{dir} ) = filenamesplit($git->{name});
 
        if (length($git->{dir}) && $git->{dir} ne './'
@@ -892,7 +910,7 @@ sub req_co
        print $state->{CVSROOT} . "/$module/" . ( defined ( $git->{dir} ) and $git->{dir} ne "./" ? $git->{dir} . "/" : "" ) . "$git->{name}\n";
 
         # this is an "entries" line
-        my $kopts = kopts_from_path($git->{name});
+        my $kopts = kopts_from_path($fullName);
         print "/$git->{name}/1.$git->{revision}//$kopts/\n";
         # permissions
         print "u=$git->{mode},g=$git->{mode},o=$git->{mode}\n";
@@ -1101,7 +1119,7 @@ sub req_update
 		print $state->{CVSROOT} . "/$state->{module}/$filename\n";
 
 		# this is an "entries" line
-		my $kopts = kopts_from_path($filepart);
+		my $kopts = kopts_from_path($filename);
 		$log->debug("/$filepart/1.$meta->{revision}//$kopts/");
 		print "/$filepart/1.$meta->{revision}//$kopts/\n";
 
@@ -1149,7 +1167,7 @@ sub req_update
                     print "Merged $dirpart\n";
                     $log->debug($state->{CVSROOT} . "/$state->{module}/$filename");
                     print $state->{CVSROOT} . "/$state->{module}/$filename\n";
-                    my $kopts = kopts_from_path($filepart);
+                    my $kopts = kopts_from_path("$dirpart/$filepart");
                     $log->debug("/$filepart/1.$meta->{revision}//$kopts/");
                     print "/$filepart/1.$meta->{revision}//$kopts/\n";
                 }
@@ -1165,7 +1183,7 @@ sub req_update
                 {
                     print "Merged $dirpart\n";
                     print $state->{CVSROOT} . "/$state->{module}/$filename\n";
-                    my $kopts = kopts_from_path($filepart);
+                    my $kopts = kopts_from_path("$dirpart/$filepart");
                     print "/$filepart/1.$meta->{revision}/+/$kopts/\n";
                 }
             }
@@ -1416,7 +1434,7 @@ sub req_ci
             }
             print "Checked-in $dirpart\n";
             print "$filename\n";
-            my $kopts = kopts_from_path($filepart);
+            my $kopts = kopts_from_path($filename);
             print "/$filepart/1.$meta->{revision}//$kopts/\n";
         }
     }
@@ -2296,10 +2314,24 @@ sub kopts_from_path
 {
 	my ($path) = @_;
 
-	# Once it exists, the git attributes system should be used to look up
-	# what attributes apply to this path.
+    if ( defined ( $cfg->{gitcvs}{usecrlfattr} ) and
+         $cfg->{gitcvs}{usecrlfattr} =~ /\s*(1|true|yes)\s*$/i )
+    {
+        my ($val) = check_attr( "crlf", $path );
+        if ( $val eq "set" )
+        {
+            return "";
+        }
+        elsif ( $val eq "unset" )
+        {
+            return "-kb"
+        }
+        else
+        {
+            $log->info("Unrecognized check_attr crlf $path : $val");
+        }
+    }
 
-	# Until then, take the setting from the config file
     unless ( defined ( $cfg->{gitcvs}{allbinary} ) and $cfg->{gitcvs}{allbinary} =~ /^\s*(1|true|yes)\s*$/i )
     {
 		# Return "" to give no special treatment to any path
@@ -2311,6 +2343,23 @@ sub kopts_from_path
     }
 }
 
+sub check_attr
+{
+    my ($attr,$path) = @_;
+    ensureWorkTree();
+    if ( open my $fh, '-|', "git", "check-attr", $attr, "--", $path )
+    {
+        my $val = <$fh>;
+        close $fh;
+        $val =~ s/.*: ([^:\r\n]*)\s*$/$1/;
+        return $val;
+    }
+    else
+    {
+        return undef;
+    }
+}
+
 # Generate a CVS author name from Git author information, by taking
 # the first eight characters of the user part of the email address.
 sub cvs_author
diff --git a/t/t9401-git-cvsserver-crlf.sh b/t/t9401-git-cvsserver-crlf.sh
new file mode 100755
index 0000000..b7a779b
--- /dev/null
+++ b/t/t9401-git-cvsserver-crlf.sh
@@ -0,0 +1,178 @@
+#!/bin/sh
+#
+# Copyright (c) 2008 Matthew Ogilvie
+# Parts adapted from other tests.
+#
+
+test_description='git-cvsserver -kb modes
+
+tests -kb mode for binary files when accessing a git
+repository using cvs CLI client via git-cvsserver server'
+
+. ./test-lib.sh
+
+q_to_nul () {
+    perl -pe 'y/Q/\000/'
+}
+
+q_to_cr () {
+    tr Q '\015'
+}
+
+marked_as () {
+    foundEntry="$(grep "^/$2/" "$1/CVS/Entries")"
+    if [ x"$foundEntry" = x"" ] ; then
+       echo "NOT FOUND: $1 $2 1 $3" >> "${WORKDIR}/marked.log"
+       return 1
+    fi
+    test x"$(grep "^/$2/" "$1/CVS/Entries" | cut -d/ -f5)" = x"$3"
+    stat=$?
+    echo "$1 $2 $stat '$3'" >> "${WORKDIR}/marked.log"
+    return $stat
+}
+
+not_present() {
+    foundEntry="$(grep "^/$2/" "$1/CVS/Entries")"
+    if [ -r "$1/$2" ] ; then
+        echo "Error: File still exists: $1 $2" >> "${WORKDIR}/marked.log"
+        return 1;
+    fi
+    if [ x"$foundEntry" != x"" ] ; then
+        echo "Error: should not have found: $1 $2" >> "${WORKDIR}/marked.log"
+        return 1;
+    else
+        echo "Correctly not found: $1 $2" >> "${WORKDIR}/marked.log"
+        return 0;
+    fi
+}
+
+cvs >/dev/null 2>&1
+if test $? -ne 1
+then
+    test_expect_success 'skipping git-cvsserver tests, cvs not found' :
+    test_done
+    exit
+fi
+perl -e 'use DBI; use DBD::SQLite' >/dev/null 2>&1 || {
+    test_expect_success 'skipping git-cvsserver tests, Perl SQLite interface unavailable' :
+    test_done
+    exit
+}
+
+unset GIT_DIR GIT_CONFIG
+WORKDIR=$(pwd)
+SERVERDIR=$(pwd)/gitcvs.git
+git_config="$SERVERDIR/config"
+CVSROOT=":fork:$SERVERDIR"
+CVSWORK="$(pwd)/cvswork"
+CVS_SERVER=git-cvsserver
+export CVSROOT CVS_SERVER
+
+rm -rf "$CVSWORK" "$SERVERDIR"
+test_expect_success 'setup' '
+    echo "Simple text file" >textfile.c &&
+    echo "File with embedded NUL: Q <- there" | q_to_nul > binfile.bin &&
+    mkdir subdir &&
+    echo "Another text file" > subdir/file.h &&
+    echo "Another binary: Q (this time CR)" | q_to_cr > subdir/withCr.bin &&
+    echo "Mixed up NUL, but marked text: Q <- there" | q_to_nul > mixedUp.c
+    echo "Unspecified" > subdir/unspecified.other &&
+    echo "/*.bin -crlf" > .gitattributes &&
+    echo "/*.c crlf" >> .gitattributes &&
+    echo "subdir/*.bin -crlf" >> .gitattributes &&
+    echo "subdir/*.c crlf" >> .gitattributes &&
+    echo "subdir/file.h crlf" >> .gitattributes &&
+    git add .gitattributes textfile.c binfile.bin mixedUp.c subdir/* &&
+    git commit -q -m "First Commit" &&
+    git clone -q --local --bare "$WORKDIR/.git" "$SERVERDIR" >/dev/null 2>&1 &&
+    GIT_DIR="$SERVERDIR" git config --bool gitcvs.enabled true &&
+    GIT_DIR="$SERVERDIR" git config gitcvs.logfile "$SERVERDIR/gitcvs.log"
+'
+
+test_expect_success 'cvs co (default crlf)' '
+    GIT_CONFIG="$git_config" cvs -Q co -d cvswork master >cvs.log 2>&1 &&
+    test x"$(grep '/-k' cvswork/CVS/Entries cvswork/subdir/CVS/Entries)" = x""
+'
+
+rm -rf cvswork
+test_expect_success 'cvs co (allbinary)' '
+    GIT_DIR="$SERVERDIR" git config --bool gitcvs.allbinary true &&
+    GIT_CONFIG="$git_config" cvs -Q co -d cvswork master >cvs.log 2>&1 &&
+    marked_as cvswork textfile.c -kb &&
+    marked_as cvswork binfile.bin -kb &&
+    marked_as cvswork .gitattributes -kb &&
+    marked_as cvswork mixedUp.c -kb &&
+    marked_as cvswork/subdir withCr.bin -kb &&
+    marked_as cvswork/subdir file.h -kb &&
+    marked_as cvswork/subdir unspecified.other -kb
+'
+
+rm -rf cvswork cvs.log
+test_expect_success 'cvs co (use attributes/allbinary)' '
+    GIT_DIR="$SERVERDIR" git config --bool gitcvs.usecrlfattr true &&
+    GIT_CONFIG="$git_config" cvs -Q co -d cvswork master >cvs.log 2>&1 &&
+    marked_as cvswork textfile.c "" &&
+    marked_as cvswork binfile.bin -kb &&
+    marked_as cvswork .gitattributes -kb &&
+    marked_as cvswork mixedUp.c "" &&
+    marked_as cvswork/subdir withCr.bin -kb &&
+    marked_as cvswork/subdir file.h "" &&
+    marked_as cvswork/subdir unspecified.other -kb
+'
+
+rm -rf cvswork
+test_expect_success 'cvs co (use attributes)' '
+    GIT_DIR="$SERVERDIR" git config --bool gitcvs.allbinary false &&
+    GIT_CONFIG="$git_config" cvs -Q co -d cvswork master >cvs.log 2>&1 &&
+    marked_as cvswork textfile.c "" &&
+    marked_as cvswork binfile.bin -kb &&
+    marked_as cvswork .gitattributes "" &&
+    marked_as cvswork mixedUp.c "" &&
+    marked_as cvswork/subdir withCr.bin -kb &&
+    marked_as cvswork/subdir file.h "" &&
+    marked_as cvswork/subdir unspecified.other ""
+'
+
+test_expect_success 'adding files' '
+    cd cvswork/subdir &&
+    echo "more text" > src.c &&
+    GIT_CONFIG="$git_config" cvs -Q add src.c >cvs.log 2>&1 &&
+    marked_as . src.c "" &&
+    echo "psuedo-binary" > temp.bin &&
+    cd .. &&
+    GIT_CONFIG="$git_config" cvs -Q add subdir/temp.bin >cvs.log 2>&1 &&
+    marked_as subdir temp.bin "-kb" &&
+    cd subdir &&
+    GIT_CONFIG="$git_config" cvs -Q ci -m "adding files" >cvs.log 2>&1 &&
+    marked_as . temp.bin "-kb" &&
+    marked_as . src.c ""
+'
+
+cd "$WORKDIR"
+test_expect_success 'updating' '
+    git pull gitcvs.git &&
+    echo 'hi' > subdir/newfile.bin &&
+    echo 'junk' > subdir/file.h &&
+    echo 'hi' > subdir/newfile.c &&
+    echo 'hello' >> binfile.bin &&
+    git add subdir/newfile.bin subdir/file.h subdir/newfile.c binfile.bin &&
+    git commit -q -m "Add and change some files" &&
+    git push gitcvs.git >/dev/null &&
+    cd cvswork &&
+    GIT_CONFIG="$git_config" cvs -Q update &&
+    cd .. &&
+    marked_as cvswork textfile.c "" &&
+    marked_as cvswork binfile.bin -kb &&
+    marked_as cvswork .gitattributes "" &&
+    marked_as cvswork mixedUp.c "" &&
+    marked_as cvswork/subdir withCr.bin -kb &&
+    marked_as cvswork/subdir file.h "" &&
+    marked_as cvswork/subdir unspecified.other "" &&
+    marked_as cvswork/subdir newfile.bin -kb &&
+    marked_as cvswork/subdir newfile.c "" &&
+    echo "File with embedded NUL: Q <- there" | q_to_nul > tmpExpect1 &&
+    echo "hello" >> tmpExpect1 &&
+    cmp cvswork/binfile.bin tmpExpect1
+'
+
+test_done
-- 
1.5.4.3.340.g97b97

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/3] git-cvsserver: add ability to guess -kb from contents
  2008-05-15  4:35   ` [PATCH 2/3] implement gitcvs.usecrlfattr Matthew Ogilvie
@ 2008-05-15  4:35     ` Matthew Ogilvie
  0 siblings, 0 replies; 11+ messages in thread
From: Matthew Ogilvie @ 2008-05-15  4:35 UTC (permalink / raw)
  To: git; +Cc: Matthew Ogilvie

If "gitcvs.allbinary" is set to "guess", then any file that has
not been explicitly marked as binary or text using the "crlf" attribute
and the "gitcvs.usecrlfattr" config will guess binary based on the contents
of the file.

Signed-off-by: Matthew Ogilvie <mmogilvi_git@miniinfo.net>
---
 Documentation/config.txt        |   13 ++-
 Documentation/git-cvsserver.txt |   14 ++-
 git-cvsserver.perl              |  193 +++++++++++++++++++++++++++++++++++---
 t/t9401-git-cvsserver-crlf.sh   |  159 ++++++++++++++++++++++++++++++++
 4 files changed, 354 insertions(+), 25 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 820795f..2c867d3 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -655,11 +655,14 @@ gitcvs.usecrlfattr
 	then 'gitcvs.allbinary' is used. See linkgit:gitattribute[5].
 
 gitcvs.allbinary::
-	If true, all files not otherwise specified using
-	'gitcvs.usecrlfattr' and an explicitly set or unset `crlf`
-	attribute are sent to the client in mode '-kb'. This
-	causes the client to treat them as binary files which
-	suppresses any newline munging it otherwise might do.
+	This is used if 'gitcvs.usecrlfattr' does not resolve
+	the correct '-kb' mode to use. If true, all
+	unresolved files are sent to the client in
+	mode '-kb'. This causes the client to treat them
+	as binary files, which suppresses any newline munging it
+	otherwise might do. Alternatively, if it is set to "guess",
+	then the contents of the file are examined to decide if
+	it is binary, similar to 'core.autocrlf'.
 
 gitcvs.dbname::
 	Database used by git-cvsserver to cache revision information
diff --git a/Documentation/git-cvsserver.txt b/Documentation/git-cvsserver.txt
index 8393028..7611c5b 100644
--- a/Documentation/git-cvsserver.txt
+++ b/Documentation/git-cvsserver.txt
@@ -311,17 +311,23 @@ to crnl conversion on some platforms.
 You can make the server use `crnl` attributes to set the '-k' modes
 for files by setting the `gitcvs.usecrlfattr` config variable.
 In this case, if `crlf` is explicitly unset ('-crnl'), then the
-will set '-kb' mode, for binary files.  If it `crlf` is set,
+server will set '-kb' mode for binary files. If `crlf` is set,
 then the '-k' mode will explicitly be left blank.  See
 also linkgit:gitattributes[5] for more information about the `crlf`
 attribute.
 
 Alternatively, if `gitcvs.usecrlfattr` config is not enabled
 or if the `crlf` attribute is unspecified for a filename, then
-the server uses the `gitcvs.allbinary` for the default setting.
-If `gitcvs.allbinary` is set, then the files not otherwise
+the server uses the `gitcvs.allbinary` config for the default setting.
+If `gitcvs.allbinary` is set, then file not otherwise
 specified will default to '-kb' mode. Otherwise the '-k' mode
-is left blank.
+is left blank. But if `gitcvs.allbinary` is set to "guess", then
+the correct '-k' mode will be guessed based on the contents of
+the file.
+
+For best consistency with cvs, it is probably best to override the
+defaults by setting `gitcvs.usecrlfattr` to true,
+and `gitcvs.allbinary` to "guess".
 
 Dependencies
 ------------
diff --git a/git-cvsserver.perl b/git-cvsserver.perl
index 58206ae..920bbe1 100755
--- a/git-cvsserver.perl
+++ b/git-cvsserver.perl
@@ -502,7 +502,7 @@ sub req_add
                 print $state->{CVSROOT} . "/$state->{module}/$filename\n";
 
                 # this is an "entries" line
-                my $kopts = kopts_from_path($filename);
+                my $kopts = kopts_from_path($filename,"sha1",$meta->{filehash});
                 $log->debug("/$filepart/1.$meta->{revision}//$kopts/");
                 print "/$filepart/1.$meta->{revision}//$kopts/\n";
                 # permissions
@@ -533,7 +533,8 @@ sub req_add
 
         print "Checked-in $dirpart\n";
         print "$filename\n";
-        my $kopts = kopts_from_path($filename);
+        my $kopts = kopts_from_path($filename,"file",
+                        $state->{entries}{$filename}{modified_filename});
         print "/$filepart/0//$kopts/\n";
 
         my $requestedKopts = $state->{opt}{k};
@@ -631,7 +632,7 @@ sub req_remove
 
         print "Checked-in $dirpart\n";
         print "$filename\n";
-        my $kopts = kopts_from_path($filename);
+        my $kopts = kopts_from_path($filename,"sha1",$meta->{filehash});
         print "/$filepart/-1.$wrev//$kopts/\n";
 
         $rmcount++;
@@ -910,7 +911,7 @@ sub req_co
        print $state->{CVSROOT} . "/$module/" . ( defined ( $git->{dir} ) and $git->{dir} ne "./" ? $git->{dir} . "/" : "" ) . "$git->{name}\n";
 
         # this is an "entries" line
-        my $kopts = kopts_from_path($fullName);
+        my $kopts = kopts_from_path($fullName,"sha1",$git->{filehash});
         print "/$git->{name}/1.$git->{revision}//$kopts/\n";
         # permissions
         print "u=$git->{mode},g=$git->{mode},o=$git->{mode}\n";
@@ -1119,7 +1120,7 @@ sub req_update
 		print $state->{CVSROOT} . "/$state->{module}/$filename\n";
 
 		# this is an "entries" line
-		my $kopts = kopts_from_path($filename);
+		my $kopts = kopts_from_path($filename,"sha1",$meta->{filehash});
 		$log->debug("/$filepart/1.$meta->{revision}//$kopts/");
 		print "/$filepart/1.$meta->{revision}//$kopts/\n";
 
@@ -1167,7 +1168,8 @@ sub req_update
                     print "Merged $dirpart\n";
                     $log->debug($state->{CVSROOT} . "/$state->{module}/$filename");
                     print $state->{CVSROOT} . "/$state->{module}/$filename\n";
-                    my $kopts = kopts_from_path("$dirpart/$filepart");
+                    my $kopts = kopts_from_path("$dirpart/$filepart",
+                                                "file",$mergedFile);
                     $log->debug("/$filepart/1.$meta->{revision}//$kopts/");
                     print "/$filepart/1.$meta->{revision}//$kopts/\n";
                 }
@@ -1183,7 +1185,8 @@ sub req_update
                 {
                     print "Merged $dirpart\n";
                     print $state->{CVSROOT} . "/$state->{module}/$filename\n";
-                    my $kopts = kopts_from_path("$dirpart/$filepart");
+                    my $kopts = kopts_from_path("$dirpart/$filepart",
+                                                "file",$mergedFile);
                     print "/$filepart/1.$meta->{revision}/+/$kopts/\n";
                 }
             }
@@ -1434,7 +1437,7 @@ sub req_ci
             }
             print "Checked-in $dirpart\n";
             print "$filename\n";
-            my $kopts = kopts_from_path($filename);
+            my $kopts = kopts_from_path($filename,"sha1",$meta->{filehash});
             print "/$filepart/1.$meta->{revision}//$kopts/\n";
         }
     }
@@ -2312,7 +2315,7 @@ sub cleanupTmpDir
 # file should get -kb.
 sub kopts_from_path
 {
-	my ($path) = @_;
+    my ($path, $srcType, $name) = @_;
 
     if ( defined ( $cfg->{gitcvs}{usecrlfattr} ) and
          $cfg->{gitcvs}{usecrlfattr} =~ /\s*(1|true|yes)\s*$/i )
@@ -2332,15 +2335,55 @@ sub kopts_from_path
         }
     }
 
-    unless ( defined ( $cfg->{gitcvs}{allbinary} ) and $cfg->{gitcvs}{allbinary} =~ /^\s*(1|true|yes)\s*$/i )
+    if ( defined ( $cfg->{gitcvs}{allbinary} ) )
     {
-		# Return "" to give no special treatment to any path
-		return "";
-    } else {
-		# Alternatively, to have all files treated as if they are binary (which
-		# is more like git itself), always return the "-kb" option
-		return "-kb";
+        if( ($cfg->{gitcvs}{allbinary} =~ /^\s*(1|true|yes)\s*$/i) )
+        {
+            return "-kb";
+        }
+        elsif( ($cfg->{gitcvs}{allbinary} =~ /^\s*guess\s*$/i) )
+        {
+            if( $srcType eq "sha1Or-k" &&
+                !defined($name) )
+            {
+                my ($ret)=$state->{entries}{$path}{options};
+                if( !defined($ret) )
+                {
+                    $ret=$state->{opt}{k};
+                    if(defined($ret))
+                    {
+                        $ret="-k$ret";
+                    }
+                    else
+                    {
+                        $ret="";
+                    }
+                }
+                if( ! ($ret=~/^(|-kb|-kkv|-kkvl|-kk|-ko|-kv)$/) )
+                {
+                    print "E Bad -k option\n";
+                    $log->warn("Bad -k option: $ret");
+                    die "Error: Bad -k option: $ret\n";
+                }
+
+                return $ret;
+            }
+            else
+            {
+                if( is_binary($srcType,$name) )
+                {
+                    $log->debug("... as binary");
+                    return "-kb";
+                }
+                else
+                {
+                    $log->debug("... as text");
+                }
+            }
+        }
     }
+    # Return "" to give no special treatment to any path
+    return "";
 }
 
 sub check_attr
@@ -2360,6 +2403,124 @@ sub check_attr
     }
 }
 
+# This should have the same heuristics as convert.c:is_binary() and related.
+# Note that the bare CR test is done by callers in convert.c.
+sub is_binary
+{
+    my ($srcType,$name) = @_;
+    $log->debug("is_binary($srcType,$name)");
+
+    # Minimize amount of interpreted code run in the inner per-character
+    # loop for large files, by totalling each character value and
+    # then analyzing the totals.
+    my @counts;
+    my $i;
+    for($i=0;$i<256;$i++)
+    {
+        $counts[$i]=0;
+    }
+
+    my $fh = open_blob_or_die($srcType,$name);
+    my $line;
+    while( defined($line=<$fh>) )
+    {
+        # Any '\0' and bare CR are considered binary.
+        if( $line =~ /\0|(\r[^\n])/ )
+        {
+            close($fh);
+            return 1;
+        }
+
+        # Count up each character in the line:
+        my $len=length($line);
+        for($i=0;$i<$len;$i++)
+        {
+            $counts[ord(substr($line,$i,1))]++;
+        }
+    }
+    close $fh;
+
+    # Don't count CR and LF as either printable/nonprintable
+    $counts[ord("\n")]=0;
+    $counts[ord("\r")]=0;
+
+    # Categorize individual character count into printable and nonprintable:
+    my $printable=0;
+    my $nonprintable=0;
+    for($i=0;$i<256;$i++)
+    {
+        if( $i < 32 &&
+            $i != ord("\b") &&
+            $i != ord("\t") &&
+            $i != 033 &&       # ESC
+            $i != 014 )        # FF
+        {
+            $nonprintable+=$counts[$i];
+        }
+        elsif( $i==127 )  # DEL
+        {
+            $nonprintable+=$counts[$i];
+        }
+        else
+        {
+            $printable+=$counts[$i];
+        }
+    }
+
+    return ($printable >> 7) < $nonprintable;
+}
+
+# Returns open file handle.  Possible invocations:
+#  - open_blob_or_die("file",$filename);
+#  - open_blob_or_die("sha1",$filehash);
+sub open_blob_or_die
+{
+    my ($srcType,$name) = @_;
+    my ($fh);
+    if( $srcType eq "file" )
+    {
+        if( !open $fh,"<",$name )
+        {
+            $log->warn("Unable to open file $name: $!");
+            die "Unable to open file $name: $!\n";
+        }
+    }
+    elsif( $srcType eq "sha1" || $srcType eq "sha1Or-k" )
+    {
+        unless ( defined ( $name ) and $name =~ /^[a-zA-Z0-9]{40}$/ )
+        {
+            $log->warn("Need filehash");
+            die "Need filehash\n";
+        }
+
+        my $type = `git cat-file -t $name`;
+        chomp $type;
+
+        unless ( defined ( $type ) and $type eq "blob" )
+        {
+            $log->warn("Invalid type '$type' for '$name'");
+            die ( "Invalid type '$type' (expected 'blob')" )
+        }
+
+        my $size = `git cat-file -s $name`;
+        chomp $size;
+
+        $log->debug("open_blob_or_die($name) size=$size, type=$type");
+
+        unless( open $fh, '-|', "git", "cat-file", "blob", $name )
+        {
+            $log->warn("Unable to open sha1 $name");
+            die "Unable to open sha1 $name\n";
+        }
+    }
+    else
+    {
+        $log->warn("Unknown type of blob source: $srcType");
+        die "Unknown type of blob source: $srcType\n";
+    }
+    return $fh;
+}
+
 # Generate a CVS author name from Git author information, by taking
 # the first eight characters of the user part of the email address.
 sub cvs_author
diff --git a/t/t9401-git-cvsserver-crlf.sh b/t/t9401-git-cvsserver-crlf.sh
index b7a779b..e27a1c5 100755
--- a/t/t9401-git-cvsserver-crlf.sh
+++ b/t/t9401-git-cvsserver-crlf.sh
@@ -175,4 +175,163 @@ test_expect_success 'updating' '
     cmp cvswork/binfile.bin tmpExpect1
 '
 
+rm -rf cvswork
+test_expect_success 'cvs co (use attributes/guess)' '
+    GIT_DIR="$SERVERDIR" git config gitcvs.allbinary guess &&
+    GIT_CONFIG="$git_config" cvs -Q co -d cvswork master >cvs.log 2>&1 &&
+    marked_as cvswork textfile.c "" &&
+    marked_as cvswork binfile.bin -kb &&
+    marked_as cvswork .gitattributes "" &&
+    marked_as cvswork mixedUp.c "" &&
+    marked_as cvswork/subdir withCr.bin -kb &&
+    marked_as cvswork/subdir file.h "" &&
+    marked_as cvswork/subdir unspecified.other "" &&
+    marked_as cvswork/subdir newfile.bin -kb &&
+    marked_as cvswork/subdir newfile.c ""
+'
+
+test_expect_success 'setup multi-line files' '
+    ( echo "line 1" &&
+      echo "line 2" &&
+      echo "line 3" &&
+      echo "line 4 with NUL: Q <-" ) | q_to_nul > multiline.c &&
+    git add multiline.c &&
+    ( echo "line 1" &&
+      echo "line 2" &&
+      echo "line 3" &&
+      echo "line 4" ) | q_to_nul > multilineTxt.c &&
+    git add multilineTxt.c &&
+    git commit -q -m "multiline files" &&
+    git push gitcvs.git >/dev/null
+'
+
+rm -rf cvswork
+test_expect_success 'cvs co (guess)' '
+    GIT_DIR="$SERVERDIR" git config --bool gitcvs.usecrlfattr false &&
+    GIT_CONFIG="$git_config" cvs -Q co -d cvswork master >cvs.log 2>&1 &&
+    marked_as cvswork textfile.c "" &&
+    marked_as cvswork binfile.bin -kb &&
+    marked_as cvswork .gitattributes "" &&
+    marked_as cvswork mixedUp.c -kb &&
+    marked_as cvswork multiline.c -kb &&
+    marked_as cvswork multilineTxt.c "" &&
+    marked_as cvswork/subdir withCr.bin -kb &&
+    marked_as cvswork/subdir file.h "" &&
+    marked_as cvswork/subdir unspecified.other "" &&
+    marked_as cvswork/subdir newfile.bin "" &&
+    marked_as cvswork/subdir newfile.c ""
+'
+
+test_expect_success 'cvs co another copy (guess)' '
+    GIT_CONFIG="$git_config" cvs -Q co -d cvswork2 master >cvs.log 2>&1 &&
+    marked_as cvswork2 textfile.c "" &&
+    marked_as cvswork2 binfile.bin -kb &&
+    marked_as cvswork2 .gitattributes "" &&
+    marked_as cvswork2 mixedUp.c -kb &&
+    marked_as cvswork2 multiline.c -kb &&
+    marked_as cvswork2 multilineTxt.c "" &&
+    marked_as cvswork2/subdir withCr.bin -kb &&
+    marked_as cvswork2/subdir file.h "" &&
+    marked_as cvswork2/subdir unspecified.other "" &&
+    marked_as cvswork2/subdir newfile.bin "" &&
+    marked_as cvswork2/subdir newfile.c ""
+'
+
+test_expect_success 'add text (guess)' '
+    cd cvswork &&
+    echo "simpleText" > simpleText.c &&
+    GIT_CONFIG="$git_config" cvs -Q add simpleText.c &&
+    cd .. &&
+    marked_as cvswork simpleText.c ""
+'
+
+test_expect_success 'add bin (guess)' '
+    cd cvswork &&
+    echo "simpleBin: NUL: Q <- there" | q_to_nul > simpleBin.bin &&
+    GIT_CONFIG="$git_config" cvs -Q add simpleBin.bin &&
+    cd .. &&
+    marked_as cvswork simpleBin.bin -kb
+'
+
+test_expect_success 'remove files (guess)' '
+    cd cvswork &&
+    GIT_CONFIG="$git_config" cvs -Q rm -f subdir/file.h &&
+    cd subdir &&
+    GIT_CONFIG="$git_config" cvs -Q rm -f withCr.bin &&
+    cd ../.. &&
+    marked_as cvswork/subdir withCr.bin -kb &&
+    marked_as cvswork/subdir file.h ""
+'
+
+test_expect_success 'cvs ci (guess)' '
+    cd cvswork &&
+    GIT_CONFIG="$git_config" cvs -Q ci -m "add/rm files" >cvs.log 2>&1 &&
+    cd .. &&
+    marked_as cvswork textfile.c "" &&
+    marked_as cvswork binfile.bin -kb &&
+    marked_as cvswork .gitattributes "" &&
+    marked_as cvswork mixedUp.c -kb &&
+    marked_as cvswork multiline.c -kb &&
+    marked_as cvswork multilineTxt.c "" &&
+    not_present cvswork/subdir withCr.bin &&
+    not_present cvswork/subdir file.h &&
+    marked_as cvswork/subdir unspecified.other "" &&
+    marked_as cvswork/subdir newfile.bin "" &&
+    marked_as cvswork/subdir newfile.c "" &&
+    marked_as cvswork simpleBin.bin -kb &&
+    marked_as cvswork simpleText.c ""
+'
+
+test_expect_success 'update subdir of other copy (guess)' '
+    cd cvswork2/subdir &&
+    GIT_CONFIG="$git_config" cvs -Q update &&
+    cd ../.. &&
+    marked_as cvswork2 textfile.c "" &&
+    marked_as cvswork2 binfile.bin -kb &&
+    marked_as cvswork2 .gitattributes "" &&
+    marked_as cvswork2 mixedUp.c -kb &&
+    marked_as cvswork2 multiline.c -kb &&
+    marked_as cvswork2 multilineTxt.c "" &&
+    not_present cvswork2/subdir withCr.bin &&
+    not_present cvswork2/subdir file.h &&
+    marked_as cvswork2/subdir unspecified.other "" &&
+    marked_as cvswork2/subdir newfile.bin "" &&
+    marked_as cvswork2/subdir newfile.c "" &&
+    not_present cvswork2 simpleBin.bin &&
+    not_present cvswork2 simpleText.c
+'
+
+echo "starting update/merge" >> "${WORKDIR}/marked.log"
+test_expect_success 'update/merge full other copy (guess)' '
+    git pull gitcvs.git master &&
+    sed "s/3/replaced_3/" < multilineTxt.c > ml.temp &&
+    mv ml.temp multilineTxt.c &&
+    git add multilineTxt.c &&
+    git commit -q -m "modify multiline file" >> "${WORKDIR}/marked.log" &&
+    git push gitcvs.git >/dev/null &&
+    cd cvswork2 &&
+    sed "s/1/replaced_1/" < multilineTxt.c > ml.temp &&
+    mv ml.temp multilineTxt.c &&
+    GIT_CONFIG="$git_config" cvs update > cvs.log 2>&1 &&
+    cd .. &&
+    marked_as cvswork2 textfile.c "" &&
+    marked_as cvswork2 binfile.bin -kb &&
+    marked_as cvswork2 .gitattributes "" &&
+    marked_as cvswork2 mixedUp.c -kb &&
+    marked_as cvswork2 multiline.c -kb &&
+    marked_as cvswork2 multilineTxt.c "" &&
+    not_present cvswork2/subdir withCr.bin &&
+    not_present cvswork2/subdir file.h &&
+    marked_as cvswork2/subdir unspecified.other "" &&
+    marked_as cvswork2/subdir newfile.bin "" &&
+    marked_as cvswork2/subdir newfile.c "" &&
+    marked_as cvswork2 simpleBin.bin -kb &&
+    marked_as cvswork2 simpleText.c "" &&
+    echo "line replaced_1" > tmpExpect2 &&
+    echo "line 2" >> tmpExpect2 &&
+    echo "line replaced_3" >> tmpExpect2 &&
+    echo "line 4" | q_to_nul >> tmpExpect2 &&
+    cmp cvswork2/multilineTxt.c tmpExpect2
+'
+
 test_done
-- 
1.5.4.3.340.g97b97

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] git-cvsserver: Add support for some binary files
  2008-05-15  4:35 [PATCH 0/3] git-cvsserver: Add support for some binary files Matthew Ogilvie
  2008-05-15  4:35 ` [PATCH 1/3] git-cvsserver: add mechanism for managing working tree and current directory Matthew Ogilvie
@ 2008-05-17  0:03 ` Junio C Hamano
  2008-05-18 22:10   ` Matthew Ogilvie
  2008-05-18 22:38   ` Martin Langhoff
  1 sibling, 2 replies; 11+ messages in thread
From: Junio C Hamano @ 2008-05-17  0:03 UTC (permalink / raw)
  To: Matthew Ogilvie; +Cc: git, Martin Langhoff, Frank Lichtenheld

Matthew Ogilvie <mmogilvi_git@miniinfo.net> writes:

> This series of patches extends git-cvsserver to support telling the
> CVS client to set the -kb (binary) mode for files that git considers
> to be binary (and not for text files).  It includes updates to
> documentation and tests.

I am unfortunately not familiar with this part of the system and I'd need
to summon help from experts, but it looks rather nicely done.

I saw a few places that said "crnl" instead of "crlf" in the
documentation, which I munged locally before queuing.

I noticed kopts_from_path in patch 3/3 takes $srcType of "sha1Or-k" but I
could not spot which caller gives such token to the function.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] git-cvsserver: Add support for some binary files
  2008-05-17  0:03 ` [PATCH 0/3] git-cvsserver: Add support for some binary files Junio C Hamano
@ 2008-05-18 22:10   ` Matthew Ogilvie
  2008-05-18 22:38   ` Martin Langhoff
  1 sibling, 0 replies; 11+ messages in thread
From: Matthew Ogilvie @ 2008-05-18 22:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Martin Langhoff, Frank Lichtenheld

On Fri, May 16, 2008 at 05:03:40PM -0700, Junio C Hamano wrote:
> Matthew Ogilvie <mmogilvi_git@miniinfo.net> writes:
> 
> > This series of patches extends git-cvsserver to support telling the
> > CVS client to set the -kb (binary) mode for files that git considers
> > to be binary (and not for text files).  It includes updates to
> > documentation and tests.
> 
> I am unfortunately not familiar with this part of the system and I'd need
> to summon help from experts, but it looks rather nicely done.
> 
> I saw a few places that said "crnl" instead of "crlf" in the
> documentation, which I munged locally before queuing.

Sounds good.

> 
> I noticed kopts_from_path in patch 3/3 takes $srcType of "sha1Or-k" but I
> could not spot which caller gives such token to the function.

Oops.  The "sha1Or-k" cases can and probably should be removed
completely.

I can generate another patch if you would like.

It's a remnant of an approach I had been working on earlier,
when I thought there might be cases when I needed to fall back on
the -k option the user specified on the command line because I didn't
have the file contents.  But careful study revealed what I needed
elsewhere in the data structures.

--
Matthew Ogilvie   [mmogilvi_git@miniinfo.net]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] git-cvsserver: Add support for some binary files
  2008-05-17  0:03 ` [PATCH 0/3] git-cvsserver: Add support for some binary files Junio C Hamano
  2008-05-18 22:10   ` Matthew Ogilvie
@ 2008-05-18 22:38   ` Martin Langhoff
  2008-05-19  7:35     ` Matthew Ogilvie
  1 sibling, 1 reply; 11+ messages in thread
From: Martin Langhoff @ 2008-05-18 22:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Matthew Ogilvie, git, Martin Langhoff, Frank Lichtenheld

On Sat, May 17, 2008 at 12:03 PM, Junio C Hamano <junio@pobox.com> wrote:
> Matthew Ogilvie <mmogilvi_git@miniinfo.net> writes:
>
>> This series of patches extends git-cvsserver to support telling the
>> CVS client to set the -kb (binary) mode for files that git considers
>> to be binary (and not for text files).  It includes updates to
>> documentation and tests.
>
> I am unfortunately not familiar with this part of the system and I'd need
> to summon help from experts, but it looks rather nicely done.

Looks good.

I was at first a bit troubled - "cvsserver doesn't do keyword
expansion anyway" was my first thought - but it makes sense to have
this to help newline-munging clients.

IIRC, one thing that is _not_ handled well in CVS -k flag changes on
the server side (since -k modes are not versioned). If we are
guessing, this may be more likely to happen, or at least more likely
to _surprise_ people.

Matthew, have you had a chance to test k mode changes against clients?
Are we reasonably bug-compatible with the original turd^H^H^Hhing? ;-)

Sorry about the latency!

cheers,



m
-- 
 martin.langhoff@gmail.com
 martin@laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] git-cvsserver: Add support for some binary files
  2008-05-18 22:38   ` Martin Langhoff
@ 2008-05-19  7:35     ` Matthew Ogilvie
  2008-05-19  9:34       ` Johannes Schindelin
  2008-05-19 10:53       ` Martin Langhoff
  0 siblings, 2 replies; 11+ messages in thread
From: Matthew Ogilvie @ 2008-05-19  7:35 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: Junio C Hamano, git, Martin Langhoff, Frank Lichtenheld

On Mon, May 19, 2008 at 10:38:21AM +1200, Martin Langhoff wrote:
> On Sat, May 17, 2008 at 12:03 PM, Junio C Hamano <junio@pobox.com> wrote:
> > Matthew Ogilvie <mmogilvi_git@miniinfo.net> writes:
> >
> >> This series of patches extends git-cvsserver to support telling the
> >> CVS client to set the -kb (binary) mode for files that git considers
> >> to be binary (and not for text files).  It includes updates to
> >> documentation and tests.
> >
> > I am unfortunately not familiar with this part of the system and I'd need
> > to summon help from experts, but it looks rather nicely done.
> 
> IIRC, one thing that is _not_ handled well in CVS -k flag changes on
> the server side (since -k modes are not versioned). If we are
> guessing, this may be more likely to happen, or at least more likely
> to _surprise_ people.

Since git-cvsserver can (and does) refigure and send a different -k
every time it sends a new version of a file, it could be argued
that git-cvsserver actually fixes the non-versioned -k issue,
at least partly, from some perspectives.  (There's still the issue
of when a new .gitattributes file takes effect, etc.)

> 
> Matthew, have you had a chance to test k mode changes against clients?
> Are we reasonably bug-compatible with the original turd^H^H^Hhing? ;-)

So far I've only done limited testing with Linux CVS 1.12.12.
No newline-munging clients; the test cases just check the
CVS/Entries file, not the file contents.  Most of my testing has
been the test cases included with the patch.

I don't expect any new compatibility problems from this.  The old code
would send exactly the same data for text files and binary files
as this new code.  It was just limited to sending all files as text
or all files as binary (existing gitcvs.allbinary), instead of
allowing a mix.

I'll try to do some newline-munging tests with Cygwin CVS (talking
to a server on Linux) sometime this next week.

I've heard the most finicky CVS client is probably the
one embedded in the Eclipse plugin.  Apparently it has had trouble
with minor tweaks in new versions of official CVS, let alone
an emulation.  But given that I have never even tried Eclipse, I
probably am not a good choice for testing it, and probably wont.

----------

Generally my motivation here is to make it easier for
an organization like my day job to transition to git.  I generally
don't intend to use git-cvsserver myself much, especially not from
platforms that need the newline-munging.

I perceive one remaining big issue for git-cvsserver to be
a good replacement for real CVS: The ability to properly
support "cvs update -r VERSION", where VERSION could
be any branch, tag, CVS version number, or git commit hash.
Git-cvsserver can partially support this by checking out a
totally different sandbox as "cvs checkout VERSION" (notice
no -r), but without the ability to switch versions in place,
that is an awkward workaround at best.  Fixing this seems
really involved (extending the DB scheme, etc), unless
clients either treat the CVS version as an opaque string, or
could be easily patched to do so (so that we could just stick
a GIT hash where a CVS version string is expected).  Also, some
form of submodule support would also be nice, but not
critical (the right thing here is still vague in my mind,
which is hardly surprising as there has recently been
discussion about awkward use cases for submodules even
within native git itself).  Merging with "cvs update -j ..."
and creating tags/branches with CVS are relatively unimportant:
rare enough that they could probably just be reserved for
git itself.  Does anyone have any thoughts about any of this?

            - Matthew

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] git-cvsserver: Add support for some binary files
  2008-05-19  7:35     ` Matthew Ogilvie
@ 2008-05-19  9:34       ` Johannes Schindelin
  2008-05-20  3:05         ` Matthew Ogilvie
  2008-05-19 10:53       ` Martin Langhoff
  1 sibling, 1 reply; 11+ messages in thread
From: Johannes Schindelin @ 2008-05-19  9:34 UTC (permalink / raw)
  To: Matthew Ogilvie
  Cc: Martin Langhoff, Junio C Hamano, git, Martin Langhoff,
	Frank Lichtenheld

Hi,

On Mon, 19 May 2008, Matthew Ogilvie wrote:

> I perceive one remaining big issue for git-cvsserver to be a good 
> replacement for real CVS: The ability to properly support "cvs update -r 
> VERSION", where VERSION could be any branch, tag, CVS version number, or 
> git commit hash. Git-cvsserver can partially support this by checking 
> out a totally different sandbox as "cvs checkout VERSION" (notice no 
> -r), but without the ability to switch versions in place, that is an 
> awkward workaround at best.

I might be missing something obvious, but would it not be better to _not_ 
check out anything, but serve every object straight from the object 
database (possibly with CR/LF mangling)?

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] git-cvsserver: Add support for some binary files
  2008-05-19  7:35     ` Matthew Ogilvie
  2008-05-19  9:34       ` Johannes Schindelin
@ 2008-05-19 10:53       ` Martin Langhoff
  1 sibling, 0 replies; 11+ messages in thread
From: Martin Langhoff @ 2008-05-19 10:53 UTC (permalink / raw)
  To: Matthew Ogilvie; +Cc: Junio C Hamano, git, Martin Langhoff, Frank Lichtenheld

On Mon, May 19, 2008 at 7:35 PM, Matthew Ogilvie
<mmogilvi_git@miniinfo.net> wrote:
> I've heard the most finicky CVS client is probably the
> one embedded in the Eclipse plugin.  Apparently it has had trouble
> with minor tweaks in new versions of official CVS, let alone
> an emulation.  But given that I have never even tried Eclipse, I
> probably am not a good choice for testing it, and probably wont.

Indeed. We've had endless trouble with Eclipse :-(

> Generally my motivation here is to make it easier for
> an organization like my day job to transition to git.  I generally
> don't intend to use git-cvsserver myself much, especially not from
> platforms that need the newline-munging.

That's exactly the reason why interest in cvsserver is always fleeting
-- people hack on it during their team's transition. Perhaps you can
get some help from an Eclipse-wielding member of your team ;-)

> I perceive one remaining big issue for git-cvsserver to be
> a good replacement for real CVS: The ability to properly
> support "cvs update -r VERSION", where VERSION could

That would be good, and is not too hard. You can mostly simulate that
extending the sqlite DB.

With that in place, a _very_ cool thing would be to add a special
"initial run" script, intended for projects that have just been
imported from a real CVS repo. The initial run script would look at
the CVS repo and add the needed version skew to make the revision
numbers of each file in sqlite match the cvs repo. For sane imports
this would work pretty well, and there's an amount of safe "skew" you
can add for slightly not-sane imports.

The end result is that a project can switch from CVS to git +
git-cvsserver and end users would not need to change their CVS
checkouts at all. Covert cvs->git migration, and users switch to git
on their own schedule.

WRT to your other notes, I agree that cvs update -j -j support isn't
interesting -- users that do merge will want to be on the git side of
things -- but it isn't hard. Submodules is probable not worth the
hassle - at least not yet :-) and a nested CVS checkout works
transparently - in some cases moreso than git submodules!

cheers,



m
-- 
 martin.langhoff@gmail.com
 martin@laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] git-cvsserver: Add support for some binary files
  2008-05-19  9:34       ` Johannes Schindelin
@ 2008-05-20  3:05         ` Matthew Ogilvie
  0 siblings, 0 replies; 11+ messages in thread
From: Matthew Ogilvie @ 2008-05-20  3:05 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Martin Langhoff, Junio C Hamano, git, Martin Langhoff,
	Frank Lichtenheld

On Mon, May 19, 2008 at 10:34:47AM +0100, Johannes Schindelin wrote:
> Hi,
>
> On Mon, 19 May 2008, Matthew Ogilvie wrote:
>
> > I perceive one remaining big issue for git-cvsserver to be a good
> > replacement for real CVS: The ability to properly support "cvs update -r
> > VERSION", where VERSION could be any branch, tag, CVS version number, or
> > git commit hash. Git-cvsserver can partially support this by checking
> > out a totally different sandbox as "cvs checkout VERSION" (notice no
> > -r), but without the ability to switch versions in place, that is an
> > awkward workaround at best.
>
> I might be missing something obvious, but would it not be better to _not_
> check out anything, but serve every object straight from the object
> database (possibly with CR/LF mangling)?

Ah, an opportunity to explain a few things about how git-cvsserver
works generally:

1. git-cvsserver serves most objects straight from the object database
already (actually via git-cat-file), and will continue to do
so.  The exception is when it merges user's changed files
with new versions from the repository, for which git-cvsserver
uses temporary files.

2. git-cvsserver does need to use a temporary git index file for
some things, though.  Usually it is using an empty working
directory with such an index file.  User-modified files get
temporary names that get tracked internally, but it might make
sense for modified .gitattributes files (at least) to be put into the
otherwise empty working directory with the right name so that the
changes effect the current cvs command.

3. CR/LF mangling needs to be done by the CVS client, not the server.
A windows client would do such mangling, while a Linux client would not
(both talking to the same server).  With my patch, the server just
tells the client when a file is binary, so that it won't be
mangled even on windows.

4. git-cvsserver currently does not support the "-r" argument to
checkout or update (to get a particular a branch, a tag, or a
version number).  Instead, as a kind of workaround, it has a hook to
treat the CVS "module" argument (primarily intended to be a
project name in a repository with multiple projects) as a branch
or tag name instead.  But the "module" can't be switched on the fly;
you have to checkout a completely new sandbox to get a different
module (or with git-cvsserver, another branch).

- Matthew Ogilvie

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2008-05-20  3:07 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-05-15  4:35 [PATCH 0/3] git-cvsserver: Add support for some binary files Matthew Ogilvie
2008-05-15  4:35 ` [PATCH 1/3] git-cvsserver: add mechanism for managing working tree and current directory Matthew Ogilvie
2008-05-15  4:35   ` [PATCH 2/3] implement gitcvs.usecrlfattr Matthew Ogilvie
2008-05-15  4:35     ` [PATCH 3/3] git-cvsserver: add ability to guess -kb from contents Matthew Ogilvie
2008-05-17  0:03 ` [PATCH 0/3] git-cvsserver: Add support for some binary files Junio C Hamano
2008-05-18 22:10   ` Matthew Ogilvie
2008-05-18 22:38   ` Martin Langhoff
2008-05-19  7:35     ` Matthew Ogilvie
2008-05-19  9:34       ` Johannes Schindelin
2008-05-20  3:05         ` Matthew Ogilvie
2008-05-19 10:53       ` Martin Langhoff

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).