From mboxrd@z Thu Jan 1 00:00:00 1970 From: Junio C Hamano Subject: Re: [PATCH] Add ls-files --eol-staged, --eol-worktree Date: Sun, 18 Oct 2015 11:32:17 -0700 Message-ID: References: Mime-Version: 1.0 Content-Type: text/plain Cc: git@vger.kernel.org To: Torsten =?utf-8?Q?B=C3=B6gershausen?= X-From: git-owner@vger.kernel.org Sun Oct 18 20:32:41 2015 Return-path: Envelope-to: gcvg-git-2@plane.gmane.org Received: from vger.kernel.org ([209.132.180.67]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1Znsl5-0005s8-Ur for gcvg-git-2@plane.gmane.org; Sun, 18 Oct 2015 20:32:40 +0200 Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752414AbbJRSce (ORCPT ); Sun, 18 Oct 2015 14:32:34 -0400 Received: from mail-pa0-f49.google.com ([209.85.220.49]:33536 "EHLO mail-pa0-f49.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752792AbbJRScU (ORCPT ); Sun, 18 Oct 2015 14:32:20 -0400 Received: by pabrc13 with SMTP id rc13so167646425pab.0 for ; Sun, 18 Oct 2015 11:32:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:from:to:cc:subject:references:date:in-reply-to:message-id :user-agent:mime-version:content-type; bh=wlOi8qJE3CixTe2gDpgjNVL/4/uTlrW3uwX84STkEiI=; b=TKyrAefXS8vv4Tza6x9vEDnW7kQALr9xeytmPjABEYJPkV1X6YibEftLZIEHivqwTZ gahDYkZGl8KF/01XktggcupFRadwOFWJe7TfMswrIsv0GTOkJeKh0G+jHHdng4TCTJSX N00k34U3E+sNqGODJ4PNL+8+2KC5lXRwR3aon7cmfJOWt7D0Z0WiKD8MGQkTU005wK1C UWifwI7q92pAMQi+y6lBYVkXWX9+f8WsVCOSMuoh+7o3/2t44LQNnM/9b6C/M405Cd2e iZZzrAaRdFNTWGKMmfIkTZC5zDledw6d4N/nVaIjuZ4sKvj3rRCz0b9HXDnDxFGcwsmM FdGg== X-Received: by 10.68.190.105 with SMTP id gp9mr29502332pbc.152.1445193139759; Sun, 18 Oct 2015 11:32:19 -0700 (PDT) Received: from localhost ([2620:0:1000:861b:305b:5af5:2c51:11cd]) by smtp.gmail.com with ESMTPSA id u10sm32023697pbs.63.2015.10.18.11.32.17 (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Sun, 18 Oct 2015 11:32:18 -0700 (PDT) In-Reply-To: (Junio C. Hamano's message of "Sun, 18 Oct 2015 10:35:23 -0700") User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux) Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Archived-At: Junio C Hamano writes: > If you say 1 and 2 are with LF, 4 and 6 are with CRLF, eveyrything > else is mixed, then you are losing the distinction between 1 and 2 > (and 4 and 6) that you made when the files were a single liner (with > or without the incomplete line ending). Is that desirable? Continuing this line of thought further. My answer is "it may not be desirable, but it is sufficient to decide what line ending to use when I am adding a new line to the existing contents". That is, to a file with a single incomplete line, I can add my new line with either LF or CRLF and the resulting one will become LF (or CRLF) that ends in an incomplete line. To a file whose lines consistently use LF, I can only add my new line with LF, whether the original ends in an incomplete line. So from that point of view,... > I wonder if it would be easier for the scripts that process the > output from this command to handle if the report said what > combination of _three_ possible line-ending is used. i.e. does the > file contain a line that ends with LF? does the file contain a line > that ends with CRLF? does the file contain a line with missing EOL? ... instead of saying there are three possible line-endings, we can stick to two, i.e. "is it text or binary" followed by "among two possible endings, which ones are used", i.e. text text-lf text-crlf text-lf-crlf which matches the last four lines of what you had in the "like this" example (and I prefer "mixed" over "lf-crlf"). So I am OK with the categorization after all with respect to the possible incomplete line at the end. But if that is what the feature is designed for, the documentation must say it very clearly, i.e. "this is to allow you decide what line ending to use when you add a new line to the existing contents" or something. And viewed from that angle, there is no reason to special-case an empty line. Knowing "binary" is important because you want to be able to say "whether LF or CRLF, you do *not* want to add your new line to this binary file."; "empty" is just like "text"---you can use either and get a coherent result. So I'd suggest sticking to these classification tokens: binary text crlf mixed lf The "adding my line at the beginning of the file" script can do something like this using them (here I am simplifying by making your feature available to "git get-eol" command that takes a single path and does your computation): case "$(git get-eol file)" in text | lf) printf "%s\n" "$mine" ;; crlf) printf "%s\r\n" "$mine" ;; *) # that is 'binary' or 'mixed' die "do not muck with the contents of file" ;; esac cat file Also this points at another direction of using the three independent line ending conventions I suggested earlier. What you want to append your lines? You would want to know if the file ends with an incomplete line, so you would rather want to be told with a set of categories like this instead: binary incomplete incomplete,crlf incomplete,mixed incomplete,lf crlf mixed lf Note that an empty file will get an empty string as the grouping above, as it does not have any line ending (i.e. no crlf/mixed/lf), does not end with an incomplete line and is not a binary file. And the using script would become: existing=$(git get-eol file) eol='\n' case ",$existing" in ,binary | *,mixed) die "do not muck with the contents of file" ;; ,*crlf) eol='\r\n' ;; esac cat file case "$existing," in incomplete,*) printf "$eol" ;; esac printf "%s$eol" "$mine"