* Re: [PATCH] Add ls-files --eol-staged, --eol-worktree
2015-10-17 20:18 [PATCH] Add ls-files --eol-staged, --eol-worktree Torsten Bögershausen
@ 2015-10-18 1:10 ` Eric Sunshine
2015-10-18 10:03 ` Philip Oakley
` (2 subsequent siblings)
3 siblings, 0 replies; 8+ messages in thread
From: Eric Sunshine @ 2015-10-18 1:10 UTC (permalink / raw)
To: Torsten Bögershausen; +Cc: Git List
On Sat, Oct 17, 2015 at 4:18 PM, Torsten Bögershausen <tboegi@web.de> wrote:
> Make it possible to show the line endings of files.
> Files which are staged and/or files in the working tree:
>
> git ls-files --eol-staged
> git ls-files --eol-worktree
>
> Both will show an output like this:
>
> empty empty_file
> bin binary_file_or_with_cr_handled_as_binary
> txt-crlf text_file_with_crlf
> txt-mix text_file_with_crlf_and_lf
> txt-lf text_file_with_lf
> txt text_file_with_no_eol_at_all
>
> Implementation details:
> Make struct text_stat, is_binary() and gather_stats() from convert.c
> public, add a new function get_convert_stats_ascii() and use it
> in and use them in ls-files.
s/and use it in and use them in/and use them in/
> git ls-files --eol-staged will give a line like this:
"... line like this:" what?
> Signed-off-by: Torsten Bögershausen <tboegi@web.de>
> ---
> --- a/builtin/ls-files.c
> +++ b/builtin/ls-files.c
> @@ -68,6 +70,11 @@ static void show_dir_entry(const char *tag, struct dir_entry *ent)
> return;
>
> fputs(tag, stdout);
> + if (show_eol_wt) {
> + printf("%s ", get_convert_stats_ascii(ent->name,
> + GET_CONVERT_STATS_ASCII_WT, 0));
Whitespace-damaged patch?
> + }
Style: unnecessary braces
> +
> write_name(ent->name);
> }
>
> @@ -170,6 +177,14 @@ static void show_ce_entry(const char *tag, const struct cache_entry *ce)
> find_unique_abbrev(ce->sha1,abbrev),
> ce_stage(ce));
> }
> + if (show_eol_staged) {
> + printf("%s ",
> + get_convert_stats_ascii(ce->name, GET_CONVERT_STATS_ASCII_BLOB, 0));
> + }
> + if (show_eol_wt) {
> + printf("%s ", get_convert_stats_ascii(ce->name,GET_CONVERT_STATS_ASCII_WT,
> + ce->ce_stat_data.sd_size));
> + }
Style: unnecessary braces (both cases)
> write_name(ce->name);
> if (debug_mode) {
> const struct stat_data *sd = &ce->ce_stat_data;
> @@ -206,6 +221,10 @@ static void show_ru_info(void)
> printf("%s%06o %s %d\t", tag_resolve_undo, ui->mode[i],
> find_unique_abbrev(ui->sha1[i], abbrev),
> i + 1);
> + if (show_eol_wt) {
> + printf("%s ",
> + get_convert_stats_ascii(path, GET_CONVERT_STATS_ASCII_WT, 0));
> + }
Style: unnecessary braces
> write_name(path);
> }
> }
> @@ -433,6 +452,8 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
> OPT_BIT(0, "directory", &dir.flags,
> N_("show 'other' directories' names only"),
> DIR_SHOW_OTHER_DIRECTORIES),
> + OPT_BOOL(0, "eol-staged", &show_eol_staged, N_("show line endings of the staged file")),
> + OPT_BOOL(0, "eol-worktree", &show_eol_wt, N_("show line endings of the file in work tree")),
> OPT_NEGBIT(0, "empty-directory", &dir.flags,
> N_("don't show empty directories"),
> DIR_HIDE_EMPTY_DIRECTORIES),
> --- a/convert.c
> +++ b/convert.c
> @@ -95,6 +87,45 @@ static int is_binary(unsigned long size, struct text_stat *stats)
> return 0;
> }
>
> +
Style: unnecessary blank line
> +const char *gather_stats_ascii(const char *data, unsigned long size)
Is this name too generic? This implementation is specialized for
summarizing EOL state, but it is conceivable that there may be other
meaningful textual representations of struct text_stat, as well, so
perhaps this name ought to better reflect the EOL-centric textual
representation of this implementation.
> +{
> + struct text_stat stats;
> + if (!data || !size)
> + return("empty ");
Would it make sense to distinguish between an empty file/blob and one
which was not found?
if (!data)
return "missing";
if (!size)
return "empy";
> + gather_stats(data, size, &stats);
> + if (is_binary(size, &stats))
> + return("bin ");
Style: unnecessary parentheses on 'return'
> + else if (stats.cr != stats.crlf)
Style (nit): unnecessary 'else'
> + return("bin ");
> + else if (stats.crlf && (stats.crlf == stats.lf))
> + return("txt-crlf");
> + else if (stats.crlf && stats.lf)
> + return("txt-mix ");
> + else if (stats.lf)
> + return("txt-lf ");
> + else
> + return("txt ");
> +}
Is it a good idea for this otherwise general purpose function to be
conflating presentation with EOL classification? I would have expected
this function to return the raw classification string without trailing
whitespace ("empty", "bin", "txt-mix", etc.), and leave it up to the
caller to pad the string if alignment is desired.
> +const char *get_convert_stats_ascii(const char *path, int flags, size_t hint)
Is 'flags' an appropriate name? Presently, it's more of a "mode" than
a flag. If it is a flag, then you typically would use an unsigned
type.
If you can't foresee any additional "flags", and this really is a
"mode", then it might be clearer to split this into two functions --
one for blobs and one for files -- and callers can invoke whichever is
appropriate.
> +{
> + const char *ret = "";
Unnecessary initialization since all branches assign 'ret' unconditionally.
> + if (flags & GET_CONVERT_STATS_ASCII_BLOB) {
> + unsigned long sz;
> + void *data = read_blob_data_from_cache(path, &sz);
> + ret = gather_stats_ascii(data, sz);
gather_stats_ascii() deals gracefully with data== NULL, so it can
safely be outside the 'if (data)' conditional. Okay.
> + if (data)
> + free(data);
POSIX says that free(NULL) is safe, so the conditional is unnecessary.
> + } else if (flags & GET_CONVERT_STATS_ASCII_WT){
Style: space before {
> + struct strbuf sb = STRBUF_INIT;
> + strbuf_read_file(&sb, path, hint);
If gather_stats_ascii() should distinguish error and empty-file cases,
then checking the return value of strbuf_read_file() would be
appropriate.
> + ret = gather_stats_ascii(sb.buf, sb.len);
> + strbuf_release(&sb);
> + }
> + return ret;
> +}
> +
> --- a/convert.h
> +++ b/convert.h
> @@ -31,6 +31,20 @@ enum eol {
> +struct text_stat {
> + /* NUL, CR, LF and CRLF counts */
> + unsigned nul, cr, lf, crlf;
> +
> + /* These are just approximations! */
> + unsigned printable, nonprintable;
> +};
> +void gather_stats(const char *buf, unsigned long size, struct text_stat *stats);
> +int is_binary(unsigned long size, struct text_stat *stats);
> +const char *gather_stats_ascii(const char *buf, unsigned long size);
> +#define GET_CONVERT_STATS_ASCII_BLOB (1<<0)
> +#define GET_CONVERT_STATS_ASCII_WT (1<<1)
> +const char *get_convert_stats_ascii(const char *path, int isBlob, size_t hint);
s/isBlob/flags/ or s/isBlob/mode/ or something.
Having read the source code, I know what 'hint' is for, but its
purpose is not obvious when just reading this prototype. Also, do you
expect 'hint' to prove worthwhile or might it be a case of premature
optimization and interface complexity? (Genuine question.)
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [PATCH] Add ls-files --eol-staged, --eol-worktree
2015-10-17 20:18 [PATCH] Add ls-files --eol-staged, --eol-worktree Torsten Bögershausen
2015-10-18 1:10 ` Eric Sunshine
@ 2015-10-18 10:03 ` Philip Oakley
2015-10-18 17:35 ` Junio C Hamano
2015-10-18 19:00 ` Junio C Hamano
3 siblings, 0 replies; 8+ messages in thread
From: Philip Oakley @ 2015-10-18 10:03 UTC (permalink / raw)
To: Torsten Bögershausen, git; +Cc: tboegi
From: "Torsten Bögershausen" <tboegi@web.de>
> Make it possible to show the line endings of files.
> Files which are staged and/or files in the working tree:
>
> git ls-files --eol-staged
> git ls-files --eol-worktree
>
> Both will show an output like this:
>
> empty empty_file
> bin binary_file_or_with_cr_handled_as_binary
> txt-crlf text_file_with_crlf
> txt-mix text_file_with_crlf_and_lf
> txt-lf text_file_with_lf
> txt text_file_with_no_eol_at_all
I think that this last generic 'txt' should be explicit, e.g. 'txt-no-eols',
so that the categories are explicit and mutually exclusive.
Also, does this need a documentation update for the options, and can the
distinction between txt and bin be documented /referenced.
>
> Implementation details:
> Make struct text_stat, is_binary() and gather_stats() from convert.c
> public, add a new function get_convert_stats_ascii() and use it
> in and use them in ls-files.
> git ls-files --eol-staged will give a line like this:
>
> Signed-off-by: Torsten Bögershausen <tboegi@web.de>
> ---
> This needs to go on top of tb/t0027-crlf
>
> builtin/ls-files.c | 21 +++++++++++++++++++++
> convert.c | 51
> +++++++++++++++++++++++++++++++++++++++++----------
> convert.h | 14 ++++++++++++++
> 3 files changed, 76 insertions(+), 10 deletions(-)
>
> diff --git a/builtin/ls-files.c b/builtin/ls-files.c
> index b6a7cb0..c989e94 100644
> --- a/builtin/ls-files.c
> +++ b/builtin/ls-files.c
> @@ -27,6 +27,8 @@ static int show_killed;
> static int show_valid_bit;
> static int line_terminator = '\n';
> static int debug_mode;
> +static int show_eol_staged;
> +static int show_eol_wt;
>
> static const char *prefix;
> static int max_prefix_len;
> @@ -68,6 +70,11 @@ static void show_dir_entry(const char *tag, struct
> dir_entry *ent)
> return;
>
> fputs(tag, stdout);
> + if (show_eol_wt) {
> + printf("%s ", get_convert_stats_ascii(ent->name,
> + GET_CONVERT_STATS_ASCII_WT, 0));
> + }
> +
> write_name(ent->name);
> }
>
> @@ -170,6 +177,14 @@ static void show_ce_entry(const char *tag, const
> struct cache_entry *ce)
> find_unique_abbrev(ce->sha1,abbrev),
> ce_stage(ce));
> }
> + if (show_eol_staged) {
> + printf("%s ",
> + get_convert_stats_ascii(ce->name, GET_CONVERT_STATS_ASCII_BLOB, 0));
> + }
> + if (show_eol_wt) {
> + printf("%s ",
> get_convert_stats_ascii(ce->name,GET_CONVERT_STATS_ASCII_WT,
> + ce->ce_stat_data.sd_size));
> + }
> write_name(ce->name);
> if (debug_mode) {
> const struct stat_data *sd = &ce->ce_stat_data;
> @@ -206,6 +221,10 @@ static void show_ru_info(void)
> printf("%s%06o %s %d\t", tag_resolve_undo, ui->mode[i],
> find_unique_abbrev(ui->sha1[i], abbrev),
> i + 1);
> + if (show_eol_wt) {
> + printf("%s ",
> + get_convert_stats_ascii(path, GET_CONVERT_STATS_ASCII_WT, 0));
> + }
> write_name(path);
> }
> }
> @@ -433,6 +452,8 @@ int cmd_ls_files(int argc, const char **argv, const
> char *cmd_prefix)
> OPT_BIT(0, "directory", &dir.flags,
> N_("show 'other' directories' names only"),
> DIR_SHOW_OTHER_DIRECTORIES),
> + OPT_BOOL(0, "eol-staged", &show_eol_staged, N_("show line endings of the
> staged file")),
> + OPT_BOOL(0, "eol-worktree", &show_eol_wt, N_("show line endings of the
> file in work tree")),
> OPT_NEGBIT(0, "empty-directory", &dir.flags,
> N_("don't show empty directories"),
> DIR_HIDE_EMPTY_DIRECTORIES),
> diff --git a/convert.c b/convert.c
> index 814e814..a1c24cd 100644
> --- a/convert.c
> +++ b/convert.c
> @@ -22,15 +22,7 @@ enum crlf_action {
> CRLF_AUTO
> };
>
> -struct text_stat {
> - /* NUL, CR, LF and CRLF counts */
> - unsigned nul, cr, lf, crlf;
> -
> - /* These are just approximations! */
> - unsigned printable, nonprintable;
> -};
> -
> -static void gather_stats(const char *buf, unsigned long size, struct
> text_stat *stats)
> +void gather_stats(const char *buf, unsigned long size, struct text_stat
> *stats)
> {
> unsigned long i;
>
> @@ -76,7 +68,7 @@ static void gather_stats(const char *buf, unsigned long
> size, struct text_stat *
> /*
> * The same heuristics as diff.c::mmfile_is_binary()
> */
> -static int is_binary(unsigned long size, struct text_stat *stats)
> +int is_binary(unsigned long size, struct text_stat *stats)
> {
>
> if (stats->nul)
> @@ -95,6 +87,45 @@ static int is_binary(unsigned long size, struct
> text_stat *stats)
> return 0;
> }
>
> +
> +const char *gather_stats_ascii(const char *data, unsigned long size)
> +{
> + struct text_stat stats;
> + if (!data || !size)
> + return("empty ");
> + gather_stats(data, size, &stats);
> + if (is_binary(size, &stats))
> + return("bin ");
> + else if (stats.cr != stats.crlf)
> + return("bin ");
> + else if (stats.crlf && (stats.crlf == stats.lf))
> + return("txt-crlf");
> + else if (stats.crlf && stats.lf)
> + return("txt-mix ");
> + else if (stats.lf)
> + return("txt-lf ");
> + else
> + return("txt ");
> +}
> +
> +const char *get_convert_stats_ascii(const char *path, int flags, size_t
> hint)
> +{
> + const char *ret = "";
> + if (flags & GET_CONVERT_STATS_ASCII_BLOB) {
> + unsigned long sz;
> + void *data = read_blob_data_from_cache(path, &sz);
> + ret = gather_stats_ascii(data, sz);
> + if (data)
> + free(data);
> + } else if (flags & GET_CONVERT_STATS_ASCII_WT){
> + struct strbuf sb = STRBUF_INIT;
> + strbuf_read_file(&sb, path, hint);
> + ret = gather_stats_ascii(sb.buf, sb.len);
> + strbuf_release(&sb);
> + }
> + return ret;
> +}
> +
> static enum eol output_eol(enum crlf_action crlf_action)
> {
> switch (crlf_action) {
> diff --git a/convert.h b/convert.h
> index d9d853c..566cf0e 100644
> --- a/convert.h
> +++ b/convert.h
> @@ -31,6 +31,20 @@ enum eol {
> #endif
> };
>
> +struct text_stat {
> + /* NUL, CR, LF and CRLF counts */
> + unsigned nul, cr, lf, crlf;
> +
> + /* These are just approximations! */
> + unsigned printable, nonprintable;
> +};
> +void gather_stats(const char *buf, unsigned long size, struct text_stat
> *stats);
> +int is_binary(unsigned long size, struct text_stat *stats);
> +const char *gather_stats_ascii(const char *buf, unsigned long size);
> +#define GET_CONVERT_STATS_ASCII_BLOB (1<<0)
> +#define GET_CONVERT_STATS_ASCII_WT (1<<1)
> +const char *get_convert_stats_ascii(const char *path, int isBlob, size_t
> hint);
> +
> extern enum eol core_eol;
>
> /* returns 1 if *dst was used */
> --
> 2.6.1.443.g36d7748
>
> --
Philip
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [PATCH] Add ls-files --eol-staged, --eol-worktree
2015-10-17 20:18 [PATCH] Add ls-files --eol-staged, --eol-worktree Torsten Bögershausen
2015-10-18 1:10 ` Eric Sunshine
2015-10-18 10:03 ` Philip Oakley
@ 2015-10-18 17:35 ` Junio C Hamano
2015-10-18 18:32 ` Junio C Hamano
2015-10-18 19:00 ` Junio C Hamano
3 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2015-10-18 17:35 UTC (permalink / raw)
To: Torsten Bögershausen; +Cc: git
Torsten Bögershausen <tboegi@web.de> writes:
> Make it possible to show the line endings of files.
> Files which are staged and/or files in the working tree:
>
> git ls-files --eol-staged
> git ls-files --eol-worktree
>
> Both will show an output like this:
>
> empty empty_file
> bin binary_file_or_with_cr_handled_as_binary
> txt-crlf text_file_with_crlf
> txt-mix text_file_with_crlf_and_lf
> txt-lf text_file_with_lf
> txt text_file_with_no_eol_at_all
This _may_ be readable by who implemented it, but I cannot tell what
you are trying to say above; "like this" does not help me at all
deciphering.
Do I get the above after doing this?
$ >empty
$ >empty2
$ git ls-files --eol-worktree
empty empty_file
empty2 empty_file
or do you mean this?
$ >empty_file
$ >empty_file2
$ git ls-files --eol-worktree
empty empty_file empty_file2
or do you produce output one path at a time?
$ >empty_file
$ >empty_file2
$ git ls-files --eol-worktree
empty empty_file
empty empty_file2
> txt text_file_with_no_eol_at_all
Does it help to have this category at all? This gives the user an
indication that this file consists on a single incomplete line and
differentiates it from a file with the same single line with LF or
CR/LF line ending. What happens when I prepend to these three files
another copy of the same line with LF or CR/LF line ending? You get
6 variations:
1. line LF line
2. line LF line LF
3. line LF line CRLF
4. line CRLF line
5. line CRLF line LF
6. line CRLF line CRLF
If you say 1 and 2 are with LF, 4 and 6 are with CRLF, eveyrything
else is mixed, then you are losing the distinction between 1 and 2
(and 4 and 6) that you made when the files were a single liner (with
or without the incomplete line ending). Is that desirable?
I wonder if it would be easier for the scripts that process the
output from this command to handle if the report said what
combination of _three_ possible line-ending is used. i.e. does the
file contain a line that ends with LF? does the file contain a line
that ends with CRLF? does the file contain a line with missing EOL?
> Implementation details:
> Make struct text_stat, is_binary() and gather_stats() from convert.c
> public, add a new function get_convert_stats_ascii() and use it
> in and use them in ls-files.
> git ls-files --eol-staged will give a line like this:
>
> Signed-off-by: Torsten Bögershausen <tboegi@web.de>
> ---
This is even worse than the "like this" above ;-)
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [PATCH] Add ls-files --eol-staged, --eol-worktree
2015-10-18 17:35 ` Junio C Hamano
@ 2015-10-18 18:32 ` Junio C Hamano
0 siblings, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2015-10-18 18:32 UTC (permalink / raw)
To: Torsten Bögershausen; +Cc: git
Junio C Hamano <gitster@pobox.com> writes:
> If you say 1 and 2 are with LF, 4 and 6 are with CRLF, eveyrything
> else is mixed, then you are losing the distinction between 1 and 2
> (and 4 and 6) that you made when the files were a single liner (with
> or without the incomplete line ending). Is that desirable?
Continuing this line of thought further. My answer is "it may not
be desirable, but it is sufficient to decide what line ending to use
when I am adding a new line to the existing contents". That is, to
a file with a single incomplete line, I can add my new line with
either LF or CRLF and the resulting one will become LF (or CRLF)
that ends in an incomplete line. To a file whose lines consistently
use LF, I can only add my new line with LF, whether the original ends
in an incomplete line.
So from that point of view,...
> I wonder if it would be easier for the scripts that process the
> output from this command to handle if the report said what
> combination of _three_ possible line-ending is used. i.e. does the
> file contain a line that ends with LF? does the file contain a line
> that ends with CRLF? does the file contain a line with missing EOL?
... instead of saying there are three possible line-endings, we can
stick to two, i.e. "is it text or binary" followed by "among two
possible endings, which ones are used", i.e.
text
text-lf
text-crlf
text-lf-crlf
which matches the last four lines of what you had in the "like this"
example (and I prefer "mixed" over "lf-crlf").
So I am OK with the categorization after all with respect to the
possible incomplete line at the end. But if that is what the
feature is designed for, the documentation must say it very
clearly, i.e. "this is to allow you decide what line ending to use
when you add a new line to the existing contents" or something.
And viewed from that angle, there is no reason to special-case an
empty line. Knowing "binary" is important because you want to be
able to say "whether LF or CRLF, you do *not* want to add your new
line to this binary file."; "empty" is just like "text"---you can
use either and get a coherent result.
So I'd suggest sticking to these classification tokens:
binary
text
crlf
mixed
lf
The "adding my line at the beginning of the file" script can do
something like this using them (here I am simplifying by making your
feature available to "git get-eol" command that takes a single path
and does your computation):
case "$(git get-eol file)" in
text | lf)
printf "%s\n" "$mine"
;;
crlf)
printf "%s\r\n" "$mine"
;;
*) # that is 'binary' or 'mixed'
die "do not muck with the contents of file"
;;
esac
cat file
Also this points at another direction of using the three independent
line ending conventions I suggested earlier. What you want to
append your lines? You would want to know if the file ends with an
incomplete line, so you would rather want to be told with a set of
categories like this instead:
binary
incomplete
incomplete,crlf
incomplete,mixed
incomplete,lf
crlf
mixed
lf
Note that an empty file will get an empty string as the grouping
above, as it does not have any line ending (i.e. no crlf/mixed/lf),
does not end with an incomplete line and is not a binary file.
And the using script would become:
existing=$(git get-eol file)
eol='\n'
case ",$existing" in
,binary | *,mixed)
die "do not muck with the contents of file"
;;
,*crlf)
eol='\r\n'
;;
esac
cat file
case "$existing," in
incomplete,*)
printf "$eol"
;;
esac
printf "%s$eol" "$mine"
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] Add ls-files --eol-staged, --eol-worktree
2015-10-17 20:18 [PATCH] Add ls-files --eol-staged, --eol-worktree Torsten Bögershausen
` (2 preceding siblings ...)
2015-10-18 17:35 ` Junio C Hamano
@ 2015-10-18 19:00 ` Junio C Hamano
2015-10-19 5:23 ` Torsten Bögershausen
3 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2015-10-18 19:00 UTC (permalink / raw)
To: Torsten Bögershausen; +Cc: git
Torsten Bögershausen <tboegi@web.de> writes:
> Make it possible to show the line endings of files.
> Files which are staged and/or files in the working tree:
>
> git ls-files --eol-staged
> git ls-files --eol-worktree
Two unrelated (to the issues raised in other review responses)
issues in the UI:
- While I can see how the new feature would be useful, I am not
convinced that it is a good idea to add it to ls-files. Does the
option work well with other existing options like -s, -t, etc?
Does it make sense to combine it with other options like -m, -d,
etc? I have this suspicion that "check-attr", "check-ignore",
etc. may give a better model that fits this feature better,
i.e. "git check-eol".
- When you have an operation that works on contents in the working
tree, the established command line convention to alter the
operation to work on contents in the index is with "--cached",
and the operation by default would work on the contents in the
working tee without "--worktree". See gitcli(7).
If I were doing this as part of "ls-files", I would add a new
"--get-eol" option that inspects the working tree, and make
"--get-eol --cached" do the same for the contents in the index,
for consistency. If I were doing "git check-eol", then the
default mode of the operation would read from the working tree,
and "git check-eol --cached" would read from the index.
If the operation can work on both the contents in the index and
in the working tree at the same time, we use "--index" instead of
"--cached", but I do not think it applies here (only a small
number of commands work on both to begin with, e.g. "apply").
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [PATCH] Add ls-files --eol-staged, --eol-worktree
2015-10-18 19:00 ` Junio C Hamano
@ 2015-10-19 5:23 ` Torsten Bögershausen
2015-10-19 6:32 ` Junio C Hamano
0 siblings, 1 reply; 8+ messages in thread
From: Torsten Bögershausen @ 2015-10-19 5:23 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, philipoakley, tboegi
On 18/10/15 21:00, Junio C Hamano wrote:
> Torsten Bögershausen <tboegi@web.de> writes:
>
>> Make it possible to show the line endings of files.
>> Files which are staged and/or files in the working tree:
>>
>> git ls-files --eol-staged
>> git ls-files --eol-worktree
> Two unrelated (to the issues raised in other review responses)
> issues in the UI:
>
> - While I can see how the new feature would be useful, I am not
> convinced that it is a good idea to add it to ls-files. Does the
> option work well with other existing options like -s, -t, etc?
> Does it make sense to combine it with other options like -m, -d,
> etc? I have this suspicion that "check-attr", "check-ignore",
> etc. may give a better model that fits this feature better,
> i.e. "git check-eol".
>
(This should answer all comments, thanks everybody
@Erics Sunshine: Thanks for the review, dropped you from cc list because web.de
can't find an MX record)
I like this idea:
binary
text
crlf
mixed
lf
----------------
$ git ls-files --eol-staged -s
[snip]
100644 981f810e80008d878d6a5af1331c89dc093c5927 0 txt-lf worktree.c
---------------------
$ rm Documentation/RelNotes/2.7.0.txt
$ echo "/* */" >>builtin/ls-files.c
$ ls-files -m --eol-worktree
empty Documentation/RelNotes/2.7.0.txt
txt-lf builtin/ls-files.c
-----------------------
(The empty is a bug, thanks Eric)
------------------------------
$ ./git-ls-files --eol-worktree -t
[snip]
$ H txt-lf zlib.c
-------------------------
My understanding is that the eol options work togther with the existing option,
as long as it makes sense (but see below)
"git check-attr" will even report attributes for a file, that doesn't even exist.
"git ls-files is a command which by default operates on the staged area, unless
I mis-understand it.
And that is the main purpose:
Tell me how which line endings your staged files have, and I can tell you that
you may
consider to normalize these files because "git status", "git blame" consider
these files as changed.
(From that point of view,
"git ls-files --eol" could be the way to report the staged eols.
But then users would ask:
but why can't you tell me what I have in my worktree ?
"git ls-files --eol-worktree" could be the answer (or "git ls-files -o
--eol-worktree" )
I was thinking about adding "git check-eol", but didn't want to introduce just
another command,
as the syntax and options (-z, -o -x -X) overlap much with ls-files
Is it the common understanding to add a new command is the best solution ?
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] Add ls-files --eol-staged, --eol-worktree
2015-10-19 5:23 ` Torsten Bögershausen
@ 2015-10-19 6:32 ` Junio C Hamano
0 siblings, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2015-10-19 6:32 UTC (permalink / raw)
To: Torsten Bögershausen; +Cc: git, philipoakley
Torsten Bögershausen <tboegi@web.de> writes:
> I like this idea:
>
> binary
> text
> crlf
> mixed
> lf
If you really like it, it would mean that my attempt to use Socratic
method to enlighten you why the above is not a good idea failed X-<.
> ----------------
> $ git ls-files --eol-staged -s
> [snip]
> 100644 981f810e80008d878d6a5af1331c89dc093c5927 0 txt-lf worktree.c
Does it even make sense to give --eol-worktree in this case?
> My understanding is that the eol options work togther with the existing option,
> as long as it makes sense (but see below)
>
> "git check-attr" will even report attributes for a file, that doesn't even exist.
Both "ls-files -o/-i" talk about untracked paths, so that is not a
very useful and valid objection, is it?
> "git ls-files is a command which by default operates on the staged
> area, unless I mis-understand it.
It is even worse than that. It is true that "ls-files [-s]" is
about "--cached" and there is no equivalent to show the working tree
version. But "-t", "-d", etc. are not about the state in the index
nor the state in the working tree. They are about the relationship
between these two states.
What the new operation wants to do, if I understand correctly, is
either check the blob contents in the working tree or in the index,
which is not a good fit with what the rest of "ls-files" does for
exactly that reason. The inability to mix -s with --eol-worktree
is another natural consequence of this.
> I was thinking about adding "git check-eol", but didn't want to
> introduce just another command,
Between adding a new command that does one thing well and whose user
interaction is coherent with the rest of the system, and adding a
new operation mode to an existing command and makes the user
interaction of that existing command more incoherent by introducing
two variants --foo and --foo-worktree when there is no existing
option that has similar variant pair, I'd say we prefer to see a new
command.
The -z output, and --stdin input are what we would want to have for
the new command, but I do not think we want it to know -o, -x or -X.
You would instead pipe output from ls-files with these options to
the new command run with the --stdin option.
^ permalink raw reply [flat|nested] 8+ messages in thread