From: Torsten Bögershausen
Subject: [PATCH v7] ls-files: Add eol diagnostics
Date: Mon, 7 Dec 2015 18:09:13 +0100
Message-ID: <5665BD39.1040403@web.de>
Cc: tboegi@web.de
To: git@vger.kernel.org

When working in a cross-platform environment, a user wants to
check if text files are stored normalized in the repository and if
.gitattributes are set appropriately.

Make it possible to let Git show the line endings in the index and
in the working tree and the effective text/eol attributes.

The line endings ("eolinfo") are shown like this:
"binary"       binary file
"text-no-eol"  text file without any EOL
"text-lf"      text file with LF
"text-crlf"    text file with CRLF
"text-crlf-lf" text file with mixed line endings.

The effective text/eol attribute is one of these:
"", "-text", "text", "text=auto", "eol=lf", "eol=crlf"

git ls-files --eol gives an output like this:

i/text-no-eol   w/text-no-eol   attr/text=auto t/t5100/empty
i/binary        w/binary        attr/-text     t/test-binary-2.png
i/text-lf       w/text-lf       attr/eol=lf    t/t5100/rfc2047-info-0007
i/text-lf       w/text-crlf     attr/eol=crlf  doit.bat
i/text-crlf-lf  w/text-crlf-lf  attr/          locale/XX.po

Note that the output is meant to be human-readable and may change.

Helped-By: Eric Sunshine
Signed-off-by: Torsten Bögershausen
---
Changes since v6:
- Fixed potential memory leak in convert.c, when strbuf_read_file() fails.
- t0027: Cleanups (empty lines, egrep, un-needed quoting)
  test_when_finished 'rm e expect actual'
  There doesn't seem to be 100% consistency when and how to remove files.
  (I think if we create files, we should be able to remove them:
  use "rm" rather than "rm -f")
  Add comment about the "last test case", which removes files to run
  'git ls-files -d'

 Documentation/git-ls-files.txt |  22 +++++++++
 builtin/ls-files.c             |  19 ++++++++
 convert.c                      |  85 ++++++++++++++++++++++++++++++++
 convert.h                      |   3 ++
 t/t0027-auto-crlf.sh           | 108 ++++++++++++++++++++++++++++++++++++-----
 5 files changed, 226 insertions(+), 11 deletions(-)

diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index e26f01f..8f29c99 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -12,6 +12,7 @@ SYNOPSIS
 'git ls-files' [-z] [-t] [-v]
 		(--[cached|deleted|others|ignored|stage|unmerged|killed|modified])*
 		(-[c|d|o|i|s|u|k|m])*
+		[--eol]
 		[-x |--exclude=]
 		[-X |--exclude-from=]
 		[--exclude-per-directory=]
@@ -147,6 +148,18 @@ a space) at the start of each line:
 	possible for manual inspection; the exact format may change at any
 	time.
 
+--eol::
+	Show line endings ("eolinfo") and the text/eol attributes ("texteolattr") of files.
+	"eolinfo" is the file content identification used by Git when
+	the "text" attribute is "auto", or core.autocrlf != false.
+
+	"eolinfo" is either "" (when the info is not available), or one of "binary",
+	"text-no-eol", "text-lf", "text-crlf" or "text-crlf-lf".
+	The "texteolattr" can be "", "-text", "text", "text=auto", "eol=lf", "eol=crlf".
+
+	Both the content in the index ("i/") and the content in the working tree ("w/")
+	are shown for regular files, followed by the "texteolattr" ("attr/").
+
 \--::
 	Do not interpret any more arguments as options.
 
@@ -161,6 +174,15 @@ which case it outputs:
 
         [ ]
 
+'git ls-files --eol' will show
+i/ w/ attr/
+
+'git ls-files --eol -o' will show
+i/ w/ attr/
+
+'git ls-files --eol -s' will show
+[ ] i/ w/ attr/
+
 'git ls-files --unmerged' and 'git ls-files --stage' can be used to examine
 detailed information on unmerged paths.
 
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index b6a7cb0..ef892bc 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -27,6 +27,7 @@ static int show_killed;
 static int show_valid_bit;
 static int line_terminator = '\n';
 static int debug_mode;
+static int show_eol;
 
 static const char *prefix;
 static int max_prefix_len;
@@ -47,6 +48,21 @@ static const char *tag_modified = "";
 static const char *tag_skip_worktree = "";
 static const char *tag_resolve_undo = "";
 
+static void write_eolinfo(const struct cache_entry *ce, const char *path)
+{
+	struct stat st;
+	const char *i_txt = "";
+	const char *w_txt = "";
+	if (!show_eol)
+		return;
+	if (ce && S_ISREG(ce->ce_mode))
+		i_txt = get_cached_convert_stats_ascii(ce->name);
+	if (!lstat(path, &st) && (S_ISREG(st.st_mode)))
+		w_txt = get_wt_convert_stats_ascii(path);
+	printf("i/%-13s w/%-13s attr/%-9s ", i_txt, w_txt,
+	       get_convert_attr_ascii(path));
+}
+
 static void write_name(const char *name)
 {
 	/*
@@ -68,6 +84,7 @@ static void show_dir_entry(const char *tag, struct dir_entry *ent)
 		return;
 
 	fputs(tag, stdout);
+	write_eolinfo(NULL, ent->name);
 	write_name(ent->name);
 }
 
@@ -170,6 +187,7 @@ static void show_ce_entry(const char *tag, const struct cache_entry *ce)
 		       find_unique_abbrev(ce->sha1,abbrev),
 		       ce_stage(ce));
 	}
+	write_eolinfo(ce, ce->name);
 	write_name(ce->name);
 	if (debug_mode) {
 		const struct stat_data *sd = &ce->ce_stat_data;
@@ -433,6 +451,7 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 		OPT_BIT(0, "directory", &dir.flags,
 			N_("show 'other' directories' names only"),
 			DIR_SHOW_OTHER_DIRECTORIES),
+		OPT_BOOL(0, "eol", &show_eol, N_("show line endings of files")),
 		OPT_NEGBIT(0, "empty-directory", &dir.flags,
 			N_("don't show empty directories"),
 			DIR_HIDE_EMPTY_DIRECTORIES),
diff --git a/convert.c b/convert.c
index 814e814..61d6757 100644
--- a/convert.c
+++ b/convert.c
@@ -13,6 +13,11 @@
  * translation when the "text" attribute or "auto_crlf" option is set.
  */
 
+/* Stat bits: When BIN is set, the txt bits are unset */
+#define CONVERT_STAT_BITS_TXT_LF   (1)
+#define CONVERT_STAT_BITS_TXT_CRLF (2)
+#define CONVERT_STAT_BITS_BIN      (4)
+
 enum crlf_action {
 	CRLF_GUESS = -1,
 	CRLF_BINARY = 0,
@@ -95,6 +100,62 @@ static int is_binary(unsigned long size, struct text_stat *stats)
 	return 0;
 }
 
+static unsigned int gather_convert_stats(const char *data, unsigned long size)
+{
+	struct text_stat stats;
+	if (!data || !size)
+		return 0;
+	gather_stats(data, size, &stats);
+	if (is_binary(size, &stats) || stats.cr != stats.crlf)
+		return CONVERT_STAT_BITS_BIN;
+	else if (stats.crlf && (stats.crlf == stats.lf))
+		return CONVERT_STAT_BITS_TXT_CRLF;
+	else if (stats.crlf && stats.lf)
+		return CONVERT_STAT_BITS_TXT_CRLF | CONVERT_STAT_BITS_TXT_LF;
+	else if (stats.lf)
+		return CONVERT_STAT_BITS_TXT_LF;
+	else
+		return 0;
+}
+
+static const char *gather_convert_stats_ascii(const char *data, unsigned long size)
+{
+	unsigned int convert_stats = gather_convert_stats(data, size);
+
+	if (convert_stats & CONVERT_STAT_BITS_BIN)
+		return "binary";
+	switch (convert_stats) {
+	case CONVERT_STAT_BITS_TXT_LF:
+		return("text-lf");
+	case CONVERT_STAT_BITS_TXT_CRLF:
+		return("text-crlf");
+	case CONVERT_STAT_BITS_TXT_LF | CONVERT_STAT_BITS_TXT_CRLF:
+		return("text-crlf-lf");
+	default:
+		return ("text-no-eol");
+	}
+}
+
+const char *get_cached_convert_stats_ascii(const char *path)
+{
+	const char *ret;
+	unsigned long sz;
+	void *data = read_blob_data_from_cache(path, &sz);
+	ret = gather_convert_stats_ascii(data, sz);
+	free(data);
+	return ret;
+}
+
+const char *get_wt_convert_stats_ascii(const char *path)
+{
+	const char *ret = "";
+	struct strbuf sb = STRBUF_INIT;
+	if (strbuf_read_file(&sb, path, 0) >= 0)
+		ret = gather_convert_stats_ascii(sb.buf, sb.len);
+	strbuf_release(&sb);
+	return ret;
+}
+
 static enum eol output_eol(enum crlf_action crlf_action)
 {
 	switch (crlf_action) {
@@ -777,6 +838,30 @@ int would_convert_to_git_filter_fd(const char *path)
 	return apply_filter(path, NULL, 0, -1, NULL, ca.drv->clean);
 }
 
+const char *get_convert_attr_ascii(const char *path)
+{
+	struct conv_attrs ca;
+	enum crlf_action crlf_action;
+
+	convert_attrs(&ca, path);
+	crlf_action = input_crlf_action(ca.crlf_action, ca.eol_attr);
+	switch (crlf_action) {
+	case CRLF_GUESS:
+		return "";
+	case CRLF_BINARY:
+		return "-text";
+	case CRLF_TEXT:
+		return "text";
+	case CRLF_INPUT:
+		return "eol=lf";
+	case CRLF_CRLF:
+		return "eol=crlf";
+	case CRLF_AUTO:
+		return "text=auto";
+	}
+	return "";
+}
+
 int convert_to_git(const char *path, const char *src, size_t len,
                    struct strbuf *dst, enum safe_crlf checksafe)
 {
diff --git a/convert.h b/convert.h
index d9d853c..ccf436b 100644
--- a/convert.h
+++ b/convert.h
@@ -32,6 +32,9 @@ enum eol {
 };
 
 extern enum eol core_eol;
+extern const char *get_cached_convert_stats_ascii(const char *path);
+extern const char *get_wt_convert_stats_ascii(const char *path);
+extern const char *get_convert_attr_ascii(const char *path);
 
 /* returns 1 if *dst was used */
 extern int convert_to_git(const char *path, const char *src, size_t len,
diff --git a/t/t0027-auto-crlf.sh b/t/t0027-auto-crlf.sh
index b343651..a89f2ba 100755
--- a/t/t0027-auto-crlf.sh
+++ b/t/t0027-auto-crlf.sh
@@ -56,21 +56,16 @@ create_gitattributes () {
 }
 
 create_NNO_files () {
-	lfname=$1
-	crlfname=$2
-	lfmixcrlf=$3
-	lfmixcr=$4
-	crlfnul=$5
 	for crlf in false true input
 	do
 		for attr in "" auto text -text lf crlf
 		do
 			pfx=NNO_${crlf}_attr_${attr} &&
-			cp $lfname ${pfx}_LF.txt &&
-			cp $crlfname ${pfx}_CRLF.txt &&
-			cp $lfmixcrlf ${pfx}_CRLF_mix_LF.txt &&
-			cp $lfmixcr ${pfx}_LF_mix_CR.txt &&
-			cp $crlfnul ${pfx}_CRLF_nul.txt
+			cp CRLF_mix_LF ${pfx}_LF.txt &&
+			cp CRLF_mix_LF ${pfx}_CRLF.txt &&
+			cp CRLF_mix_LF ${pfx}_CRLF_mix_LF.txt &&
+			cp CRLF_mix_LF ${pfx}_LF_mix_CR.txt &&
+			cp CRLF_mix_LF ${pfx}_CRLF_nul.txt
 		done
 	done
 }
@@ -96,7 +91,7 @@ commit_check_warn () {
 	crlfnul=$7
 	pfx=crlf_${crlf}_attr_${attr}
 	create_gitattributes "$attr" &&
-	for f in LF CRLF repoMIX LF_mix_CR CRLF_mix_LF LF_nul CRLF_nul
+	for f in LF CRLF LF_mix_CR CRLF_mix_LF LF_nul CRLF_nul
 	do
 		fname=${pfx}_$f.txt &&
 		cp $f $fname &&
@@ -149,6 +144,36 @@ commit_chk_wrnNNO () {
 	'
 }
 
+stats_ascii () {
+	case "$1" in
+	LF)
+		echo text-lf
+		;;
+	CRLF)
+		echo text-crlf
+		;;
+	CRLF_mix_LF)
+		echo text-crlf-lf
+		;;
+	LF_mix_CR)
+		echo binary
+		;;
+	CRLF_nul)
+		echo binary
+		;;
+	LF_nul)
+		echo binary
+		;;
+	CRLF_mix_CR)
+		echo binary
+		;;
+	*)
+		echo error_invalid $1
+		;;
+	esac
+
+}
+
 check_files_in_repo () {
 	crlf=$1
 	attr=$2
@@ -214,6 +239,20 @@ checkout_files () {
 		fi
 	done
 
+	test_expect_success "ls-files --eol $lfname ${pfx}LF.txt" "
+		test_when_finished 'rm e expect actual' &&
+		cat >e <<-EOF &&
+		i/text-crlf w/$(stats_ascii $crlfname) ${src}CRLF.txt
+		i/text-crlf-lf w/$(stats_ascii $lfmixcrlf) ${src}CRLF_mix_LF.txt
+		i/text-lf w/$(stats_ascii $lfname) ${src}LF.txt
+		i/binary w/$(stats_ascii $lfmixcr) ${src}LF_mix_CR.txt
+		i/binary w/$(stats_ascii $crlfnul) ${src}CRLF_nul.txt
+		i/binary w/$(stats_ascii $crlfnul) ${src}LF_nul.txt
+		EOF
+		sort <e >expect &&
+		git ls-files --eol $src* | sed -e 's!attr/[=a-z-]*!!g' -e 's/  */ /g' | sort >actual &&
+		test_cmp expect actual
+	"
 	test_expect_success "checkout core.eol=$eol core.autocrlf=$crlf gitattributes=$attr file=LF" "
 		compare_ws_file $pfx $lfname ${src}LF.txt
 	"
@@ -231,6 +270,37 @@ checkout_files () {
 	"
 }
 
+# Test control characters
+# NUL SOH CR EOF==^Z
+test_expect_success 'ls-files --eol -o Text/Binary' '
+	test_when_finished "rm e expect actual TeBi_*" &&
+	STRT=AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA &&
+	STR=$STRT$STRT$STRT$STRT &&
+	printf "${STR}BBB\001" >TeBi_127_S &&
+	printf "${STR}BBBB\001">TeBi_128_S &&
+	printf "${STR}BBB\032" >TeBi_127_E &&
+	printf "\032${STR}BBB" >TeBi_E_127 &&
+	printf "${STR}BBBB\000">TeBi_128_N &&
+	printf "${STR}BBB\012">TeBi_128_L &&
+	printf "${STR}BBB\015">TeBi_127_C &&
+	printf "${STR}BB\015\012" >TeBi_126_CL &&
+	printf "${STR}BB\015\012\015" >TeBi_126_CLC &&
+	cat >e <<-\EOF &&
+	i/ w/binary TeBi_127_S
+	i/ w/text-no-eol TeBi_128_S
+	i/ w/text-no-eol TeBi_127_E
+	i/ w/binary TeBi_E_127
+	i/ w/binary TeBi_128_N
+	i/ w/text-lf TeBi_128_L
+	i/ w/binary TeBi_127_C
+	i/ w/text-crlf TeBi_126_CL
+	i/ w/binary TeBi_126_CLC
+	EOF
+	sort <e >expect &&
+	git ls-files --eol -o | grep TeBi_ | sed -e 's!attr/[=a-z-]*!!g' -e "s/  */ /g" | sort >actual &&
+	test_cmp expect actual
+'
+
 #######
 test_expect_success 'setup master' '
 	echo >.gitattributes &&
@@ -480,4 +550,20 @@ checkout_files native true "lf" LF CRLF CRLF_mix_LF LF_mix_CR
 checkout_files native false "crlf" CRLF CRLF CRLF CRLF_mix_CR CRLF_nul
 checkout_files native true "crlf" CRLF CRLF CRLF CRLF_mix_CR CRLF_nul
 
+
+# Should be the last test case: remove some files from the worktree
+# run 'git ls-files -d'
+test_expect_success 'ls-files --eol -d' "
+	rm crlf_false_attr__CRLF.txt crlf_false_attr__CRLF_mix_LF.txt crlf_false_attr__LF.txt .gitattributes &&
+	cat >expect <<-\EOF &&
+	i/text-crlf w/ crlf_false_attr__CRLF.txt
+	i/text-crlf-lf w/ crlf_false_attr__CRLF_mix_LF.txt
+	i/text-lf w/ .gitattributes
+	i/text-lf w/ crlf_false_attr__LF.txt
+	EOF
+	git ls-files --eol -d | sed -e 's!attr/[=a-z-]*!!g' -e 's/  */ /g' | sort >actual &&
+	test_cmp expect actual &&
+	rm expect actual
+"
+
 test_done
-- 
2.6.2.403.gd7a84e3
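
For readers who want to try the classification outside of Git, the decision order in gather_convert_stats() can be sketched as a standalone C function. This is only an illustration under stated assumptions: the names eol_counts, count_eols and classify_eol are hypothetical and not part of the patch, and the binary check is simplified to "contains a NUL byte or a lone CR". The is_binary() helper the patch calls is not shown in this diff, so whatever else it checks is not reproduced here.

```c
#include <stddef.h>

/* Hypothetical helper type; not part of the patch. */
struct eol_counts {
	unsigned long cr;   /* every CR byte */
	unsigned long lf;   /* every LF byte, including the LF of a CRLF pair */
	unsigned long crlf; /* CR immediately followed by LF */
	unsigned long nul;  /* NUL bytes */
};

/* Count line-ending bytes in buf[0..size). */
static void count_eols(const char *buf, size_t size, struct eol_counts *c)
{
	size_t i;

	c->cr = c->lf = c->crlf = c->nul = 0;
	for (i = 0; i < size; i++) {
		if (buf[i] == '\r') {
			c->cr++;
			if (i + 1 < size && buf[i + 1] == '\n')
				c->crlf++;
		} else if (buf[i] == '\n') {
			c->lf++;
		} else if (buf[i] == '\0') {
			c->nul++;
		}
	}
}

/*
 * Return an "eolinfo" string using the same decision order as the patch.
 * Assumption: only NUL bytes and lone CRs make the content "binary" here.
 */
const char *classify_eol(const char *buf, size_t size)
{
	struct eol_counts c;

	if (!buf || !size)
		return "text-no-eol";
	count_eols(buf, size, &c);
	if (c.nul || c.cr != c.crlf)
		return "binary";
	if (c.crlf && c.crlf == c.lf)
		return "text-crlf";	/* every LF belongs to a CRLF pair */
	if (c.crlf && c.lf)
		return "text-crlf-lf";	/* some LFs are bare, some are CRLF */
	if (c.lf)
		return "text-lf";
	return "text-no-eol";
}
```

With this sketch, classify_eol("a\r\nb\n", 5) returns "text-crlf-lf" and classify_eol("a\r\n\r", 4) returns "binary", which matches what the TeBi_126_CLC test case above expects for a trailing lone CR.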