* \b character escapes in CLI usage
From: Yaakov Smith @ 2025-02-25 23:44 UTC
To: git@vger.kernel.org
Hi all,
I'm not sure if this is a bug, but it definitely feels like a bug.
A colleague of mine was going through the documentation for the git configuration file format and noticed that \b is a permitted escape.
In some places, such as trying to fetch a remote with this in the URL, git will render the character differently.
[remote "backslashb"]
url = "\b"
fetch = +refs/heads/*:refs/remotes/backslashb/*
$ git fetch backslashb
fatal: '?' does not appear to be a git repository
fatal: Could not read from remote repository.
When using "git config --list" however, this is emitted in its raw format, and can be used to mask or hide an actual (probably invalid) value:
$ cat .git/config
[core]
somevalue = "true\b\b\b\bfalse"
$ git config --local --list
core.somevalue=false
Should "git config" be smarter here and print something other than a literal backspace to the terminal, like "git fetch" does?
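One way to check whether a value is hiding behind backspaces is to make them visible before the terminal acts on them. The following is a minimal sketch, not something git provides: `show_backspaces` is a hypothetical helper name, and it handles only the backspace byte, not other control characters.

```shell
# Hypothetical helper (not part of git): rewrite each raw backspace byte
# as the two visible characters "^H" so the terminal cannot act on it.
show_backspaces() {
    bs=$(printf '\b')        # a literal backspace byte (0x08)
    sed "s/$bs/^H/g"
}

# Simulated output of "git config --local --list" from the report above.
printf 'core.somevalue=true\b\b\b\bfalse\n' | show_backspaces
# prints: core.somevalue=true^H^H^H^Hfalse
```

In a real repository the same filter could be applied as `git config --local --list | show_backspaces`; because the output then goes to a pipe, no pager is started.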
[System Info]
git version:
git version 2.34.1
cpu: x86_64
no commit associated with this build
sizeof-long: 8
sizeof-size_t: 8
shell-path: /bin/sh
uname: Linux 5.10.102.1-microsoft-standard-WSL2 #1 SMP Wed Mar 2 00:30:59 UTC 2022 x86_64
compiler info: gnuc: 11.4
libc info: glibc: 2.35
$SHELL (typically, interactive shell): /bin/bash
Kind regards,
Yaakov Smith
Principal Software Engineer
Pronouns: he/him
* Re: \b character escapes in CLI usage
From: Jeff King @ 2025-02-26 7:38 UTC
To: Yaakov Smith; +Cc: git@vger.kernel.org

On Tue, Feb 25, 2025 at 11:44:33PM +0000, Yaakov Smith wrote:

> In some places, such as trying to fetch a remote with this in the URL, git will render the character differently.
>
> [remote "backslashb"]
> url = "\b"
> fetch = +refs/heads/*:refs/remotes/backslashb/*
>
> $ git fetch backslashb
> fatal: '?' does not appear to be a git repository
> fatal: Could not read from remote repository.

Here we sanitize error output, because we know the result is human
readable (and likely to be showing untrusted input from the repo or a
remote server).

> When using "git config --list" however, this is emitted in its raw format, and can be used to mask or hide an actual (probably invalid) value:
>
> $ cat .git/config
> [core]
> somevalue = "true\b\b\b\bfalse"
> $ git config --local --list
> core.somevalue=false

But here, the point of "git config" is to show the output. If we
sanitized it (especially in a lossy way like we do for error messages),
then any program reading the output would not see the real data.

> Should "git config" be smarter here and print something other than a
> literal backspace to the terminal, like "git fetch" does?

So I would say no here, in general.

We could perhaps try to be kinder about sanitizing output when it is
going to a terminal, rather than a pipe. But quite curiously, that
should already be the case for "config --list"! It invokes a pager by
default.

Much to my surprise, though, "less" does not seem to treat backspace as
a control character. It can be configured to do so:

  $ LESS=FRXU git config --list --local
  ...
  core.foo=true^H^H^H^Hfalse

Here's what the manpage for less(1) says:

  By default, if neither -u nor -U is given, backspaces which appear
  adjacent to an underscore character are treated specially: the
  underlined text is displayed using the terminal's hardware
  underlining capability. Also, backspaces which appear between two
  identical characters are treated specially: the overstruck text is
  printed using the terminal's hardware boldface capability. Other
  backspaces are deleted, along with the preceding character.[...]

So I guess it is intentional to allow programs to use some effects, but
in general I think I might prefer them being marked visually.
Especially because the same would be true in a diff, like:

  git init
  echo old >file && git add file && git commit -m old
  printf 'sneaky\b\b\b\b\bnew\n' >file && git commit -am new
  git show

which respects the backspaces (actually it says "snew" with a bolded
"n" because of the overstrike rule ;) ).

I wonder if we should consider adding "U" to the default $LESS variable
we set.

-Peff
* Re: \b character escapes in CLI usage
From: Jeff King @ 2025-02-26 8:09 UTC
To: Yaakov Smith; +Cc: git@vger.kernel.org

On Wed, Feb 26, 2025 at 02:38:23AM -0500, Jeff King wrote:

> I wonder if we should consider adding "U" to the default $LESS variable
> we set.

Having tried this for 5 minutes, the answer is a resounding "no". It
also treats tabs as control characters, making source code diffs rather
ugly. ;)

In modern versions of less you can get around it with:

  LESS="-U --proc-tab"

or:

  LESS="--PROC-BACKSPACE"

but those are new in less 632, from the last year or two. So I don't
think we can rely on it in our default variable, but people with recent
versions of less should consider setting it.

Looks like it was added for exactly this case:

  https://github.com/gwsw/less/issues/335

-Peff
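For anyone who wants to follow the advice above in a shell profile, a version check along these lines might do. This is only a sketch under assumptions: `less_version` and `has_proc_backspace` are hypothetical helper names, the first line of `less --version` is assumed to look like "less 643 (...)", and the 632 threshold is taken from the message above rather than verified independently.

```shell
# Hypothetical helper: read "less --version" output on stdin and print
# the numeric version from its first line (e.g. "less 643 (...)" -> 643).
less_version() {
    sed -n '1s/^less \([0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: succeed if the given version is at least 632,
# which the message above names as the first with --PROC-BACKSPACE.
# An empty or missing argument counts as version 0.
has_proc_backspace() {
    [ "${1:-0}" -ge 632 ]
}

# Only pick a value when the user has not set $LESS themselves.
if [ -z "${LESS+set}" ] && command -v less >/dev/null 2>&1; then
    v=$(less --version 2>/dev/null | less_version)
    if has_proc_backspace "$v"; then
        # FRX is what git uses when $LESS is unset; the long option is
        # appended after a space, as in the LESS="-U --proc-tab" example.
        export LESS="-FRX --PROC-BACKSPACE"
    fi
fi
```

On an older less the variable is simply left alone, so git falls back to its usual default.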
* Re: \b character escapes in CLI usage
From: Junio C Hamano @ 2025-02-26 16:02 UTC
To: Jeff King; +Cc: Yaakov Smith, git@vger.kernel.org

Jeff King <peff@peff.net> writes:

> On Wed, Feb 26, 2025 at 02:38:23AM -0500, Jeff King wrote:
>
>> I wonder if we should consider adding "U" to the default $LESS variable
>> we set.
>
> Having tried this for 5 minutes, the answer is a resounding "no". It
> also treats tabs as control characters, making source code diffs rather
> ugly. ;)

;-)
* Re: \b character escapes in CLI usage
From: Kyle Lippincott @ 2025-02-26 16:38 UTC
To: Jeff King; +Cc: Yaakov Smith, git@vger.kernel.org

On Wed, Feb 26, 2025 at 12:09 AM Jeff King <peff@peff.net> wrote:
>
> On Wed, Feb 26, 2025 at 02:38:23AM -0500, Jeff King wrote:
>
> > I wonder if we should consider adding "U" to the default $LESS variable
> > we set.
>
> Having tried this for 5 minutes, the answer is a resounding "no". It
> also treats tabs as control characters, making source code diffs rather
> ugly. ;)
>
> In modern versions of less you can get around it with:
>
>   LESS="-U --proc-tab"
>
> or:
>
>   LESS="--PROC-BACKSPACE"
>
> but those are new in less 632, from the last year or two. So I don't
> think we can rely on it in our default variable, but people with recent
> versions of less should consider setting it.

From another issue (https://github.com/gwsw/less/issues/557) I learned
you can do this:

  LESSKEY_CONTENT='#env;#version>=632 LESS=${LESS} --PROC-BACKSPACE'

I haven't tested it yet, but that might be a decent solution? I don't
know how composable those are; e.g. if you wanted both
--PROC-BACKSPACE on >=632 and --no-poll on >=670, I'm *assuming* you
can do that, but I don't know what the syntax looks like.

>
> Looks like it was added for exactly this case:
>
>   https://github.com/gwsw/less/issues/335
>
> -Peff
>
* Re: \b character escapes in CLI usage
From: Jeff King @ 2025-02-26 22:06 UTC
To: Kyle Lippincott; +Cc: Yaakov Smith, git@vger.kernel.org

On Wed, Feb 26, 2025 at 08:38:15AM -0800, Kyle Lippincott wrote:

> > In modern versions of less you can get around it with:
> >
> >   LESS="-U --proc-tab"
> >
> > or:
> >
> >   LESS="--PROC-BACKSPACE"
> >
> > but those are new in less 632, from the last year or two. So I don't
> > think we can rely on it in our default variable, but people with recent
> > versions of less should consider setting it.
>
> From another issue (https://github.com/gwsw/less/issues/557) I learned
> you can do this:
>
>   LESSKEY_CONTENT='#env;#version>=632 LESS=${LESS} --PROC-BACKSPACE'
>
> I haven't tested it yet, but that might be a decent solution? I don't
> know how composable those are; e.g. if you wanted both
> --PROC-BACKSPACE on >=632 and --no-poll on >=670, I'm *assuming* you
> can do that, but I don't know what the syntax looks like.

Thanks for the pointer, I didn't know about that. I think it is fully
composable, as the "#version" conditional just applies to one line. So:

  LESSKEY_CONTENT='#env;#version>=632 LESS=${LESS} --PROC-BACKSPACE;#version >=670 LESS=${LESS} --no-poll'

However, I couldn't get even the basic version to work. Turns out that
LESSKEY_CONTENT was added in 645 (and I'm running 643 from Debian
unstable).

So it kind-of works for our case if we make 645 the minimum (and don't
help versions between 632 and 645 at all). I think we could get even
hackier to support old versions like:

  # probably would be $prefix/share/git/lesskey in a real install
  fn=/tmp/lesskey
  {
	echo "#env"
	echo "#version >= 632 LESS=${LESS} --PROC-BACKSPACE"
  } >"$fn"
  # This works back to less 582. Before that we have to compile it to a
  # binary format with "lesskey" and point to it with $LESSKEY.
  LESSKEYIN=$fn git log

I guess we'd also need to put more details into setup_pager_env(). Right
now its logic is just "do not set $FOO if $FOO is already set". But we'd
probably want rules like "if $LESS is set, do not try to override it
with $LESSKEY_CONTENT".

I'm inclined to punt on it for a while. People can set up $LESS
themselves based on what they have available, and once these features
have been around for a while, we might then consider adding them to our
defaults.

-Peff
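The rule proposed for setup_pager_env() can be illustrated in shell. This is a hypothetical rendering only: git's real logic is written in C and may differ, `pager_env_defaults` is an invented name, and the LESSKEY_CONTENT value is copied from the suggestion earlier in the thread, which was reported there as untested.

```shell
# Hypothetical shell rendering of the rule discussed above; git's real
# logic lives in C in setup_pager_env() and may differ.
pager_env_defaults() {
    # Existing rule: do not set $LESS if the user already set it.
    if [ -z "${LESS+set}" ]; then
        LESS=FRX
        export LESS
        # Proposed extra rule: only offer $LESSKEY_CONTENT when we chose
        # $LESS ourselves and the user has not set it either.
        if [ -z "${LESSKEY_CONTENT+set}" ]; then
            LESSKEY_CONTENT='#env;#version>=632 LESS=${LESS} --PROC-BACKSPACE'
            export LESSKEY_CONTENT
        fi
    fi
}
```

With this shape, a user who exports their own $LESS never sees it overridden through the lesskey mechanism, which is the concern raised above.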
* Re: \b character escapes in CLI usage
From: Junio C Hamano @ 2025-02-26 15:59 UTC
To: Jeff King; +Cc: Yaakov Smith, git@vger.kernel.org

Jeff King <peff@peff.net> writes:

> I wonder if we should consider adding "U" to the default $LESS variable
> we set.

Thanks for analyzing the "less" issue.

We should be OK if we lost the overstrike from the pager we directly
spawn and write into. I think it is a good thing to consider,
especially because "the default $LESS variable we set" should not
affect the pager indirectly triggered by us spawning "man".

Thanks.
* Re: \b character escapes in CLI usage
From: brian m. carlson @ 2025-02-26 23:36 UTC
To: Jeff King; +Cc: Yaakov Smith, git@vger.kernel.org

On 2025-02-26 at 07:38:22, Jeff King wrote:
> On Tue, Feb 25, 2025 at 11:44:33PM +0000, Yaakov Smith wrote:
> > When using "git config --list" however, this is emitted in its raw format, and can be used to mask or hide an actual (probably invalid) value:
> >
> > $ cat .git/config
> > [core]
> > somevalue = "true\b\b\b\bfalse"
> > $ git config --local --list
> > core.somevalue=false
>
> But here, the point of "git config" is to show the output. If we
> sanitized it (especially in a lossy way like we do for error messages),
> then any program reading the output would not see the real data.

Yes, I should point out that, among other programs, Git LFS reads this
output. Changing the output format would break those programs.

> > Should "git config" be smarter here and print something other than a
> > literal backspace to the terminal, like "git fetch" does?
>
> So I would say no here, in general.

I agree this is the right choice in general. I wonder if we might want
some sort of human-readable output option that might escape these that
users could use. The output might still be machine-readable, but it
might be easier to parse than the current format, which has some tricky
edge cases when a config value contains newlines.

We already have precedent for this in core.quotePath and could easily
use similar logic here. That format, while using octal, which I find
ugly and hard to read, does have the pleasant side effect that it works
correctly with POSIX printf(1) (which I'm sure was intentional), unlike
hex escapes.
--
brian m. carlson (they/them or he/him)
Toronto, Ontario, CA
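The point about octal escapes and printf(1) can be tried out in isolation. The sketch below is illustrative only: `quote_backspaces` is a hypothetical helper that rewrites just the backspace byte as the octal escape `\010`, and git's own C-style quoting may spell a given byte differently.

```shell
# Hypothetical helper: rewrite raw backspace bytes as the C-style octal
# escape \010 (all other bytes are passed through untouched).
quote_backspaces() {
    bs=$(printf '\b')            # a literal backspace byte (0x08)
    sed "s/$bs/\\\\010/g"        # sed sees \\010 and emits \010
}

quoted=$(printf 'true\b\b\b\bfalse\n' | quote_backspaces)
printf '%s\n' "$quoted"
# prints: true\010\010\010\010false

# POSIX printf(1) understands octal escapes in its format string, so the
# raw bytes can be recovered (safe here only because the value has no "%").
printf "$quoted\n" | od -c
```

The quoted form is safe to show on a terminal, and the final `printf` call turns it back into the original bytes, which `od -c` then displays without letting the terminal interpret them.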
* Re: \b character escapes in CLI usage
From: Junio C Hamano @ 2025-02-26 23:55 UTC
To: brian m. carlson; +Cc: Jeff King, Yaakov Smith, git@vger.kernel.org

"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> We already have precedent for this in core.quotePath and could easily
> use similar logic here. That format, while using octal, which I find
> ugly and hard to read, does have the pleasant side effect that it works
> correctly with POSIX printf(1) (which I'm sure was intentional), unlike
> hex escapes.

It was intended to be "the normal C quoting":

  https://lore.kernel.org/git/87ek6s0w34.fsf@penguin.cs.ucla.edu/
* Re: \b character escapes in CLI usage
From: Junio C Hamano @ 2025-02-27 0:03 UTC
To: brian m. carlson; +Cc: Jeff King, Yaakov Smith, git@vger.kernel.org

"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> I agree this is the right choice in general. I wonder if we might want
> some sort of human-readable output option that might escape these that
> users could use.
> The output might still be machine-readable, ...

I wonder if isatty(1) is a good way to say "ah, we are not captured
in 'foo=$(git blah)' and not feeding somebody in 'git blah |
somebody', so we do not have to worry about being machine readable".
If that is a reliable way to tell that we could butcher our output
for the sake of keeping the terminal state sane, we then can always
do the C-quote escaping, or even information losing '?' redaction.
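In shell terms the isatty(1) check corresponds to `[ -t 1 ]`. The filter below is a rough sketch of the idea, not a proposal for git's implementation: `emit_for_reader` is a hypothetical name, and it redacts only the backspace byte, using the lossy '?' style mentioned above.

```shell
# Hypothetical filter: pass stdin through unchanged when stdout is not a
# terminal, but redact backspaces as "?" when a person is likely reading.
emit_for_reader() {
    if [ -t 1 ]; then
        # stdout is a terminal: lossy, but keeps the display honest
        tr '\010' '?'
    else
        # stdout is a pipe, file or command substitution: keep raw bytes
        cat
    fi
}
```

Run directly on a terminal, `printf 'true\b\b\b\bfalse\n' | emit_for_reader` would show `true????false`, while `foo=$(printf 'true\b\b\b\bfalse\n' | emit_for_reader)` still captures the raw bytes, so scripts reading the output are unaffected.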
* General output formatting (was: Re: \b character escapes in CLI usage)
From: Marc Branchaud @ 2025-02-27 14:06 UTC
To: Junio C Hamano, brian m. carlson
Cc: Jeff King, Yaakov Smith, git@vger.kernel.org

On 2025-02-26 19:03, Junio C Hamano wrote:
> "brian m. carlson" <sandals@crustytoothpaste.net> writes:
>
>> I agree this is the right choice in general. I wonder if we might want
>> some sort of human-readable output option that might escape these that
>> users could use.
>> The output might still be machine-readable, ...
>
> I wonder if isatty(1) is a good way to say "ah, we are not captured
> in 'foo=$(git blah)' and not feeding somebody in 'git blah |
> somebody', so we do not have to worry about being machine readable".
> If that is a reliable way to tell that we could butcher our output
> for the sake of keeping the terminal state sane, we then can always
> do the C-quote escaping, or even information losing '?' redaction.

Modern practice seems to be moving towards explicit format options to
let code that's parsing output directly specify how it wants to see the
data. Such options eliminate the need for isatty() heuristics and other
guesswork.

For example, the ip command (at least in Ubuntu) accepts -j to format
output as JSON. I've found this to be immensely helpful for my scripts.

I'm sure Git scripters would appreciate something similar, perhaps as a
global "--format=X" option to "git" itself. isatty() heuristics could
still be used when no formatting option is specified (though I suspect
in the long run the default will end up being terminal-friendly output).

This would certainly be a large effort, but once the basic pattern is
worked out it could be incrementally implemented one command at a time.

		M.
* Re: General output formatting
From: Junio C Hamano @ 2025-02-27 17:06 UTC
To: Marc Branchaud
Cc: brian m. carlson, Jeff King, Yaakov Smith, git@vger.kernel.org

Marc Branchaud <marcnarc@xiplink.com> writes:

>> I wonder if isatty(1) is a good way to say "ah, we are not captured
>> in 'foo=$(git blah)' and not feeding somebody in 'git blah |
>> somebody', so we do not have to worry about being machine readable".
>> If that is a reliable way to tell that we could butcher our output
>> for the sake of keeping the terminal state sane, we then can always
>> do the C-quote escaping, or even information losing '?' redaction.
>
> Modern practice seems to be moving towards explicit format options to
> let code that's parsing output directly specify how it wants to see
> the data. Such options eliminate the need for isatty() heuristics and
> other guesswork.

I am not opposed to an explicit "please avoid raw binary output" or
even "please make it even more machine-processable by formatting in
yaml" options. What I was hinting at was what the default should be
for interactive use when the output goes directly to the eyes of
end-users, which is pretty much orthogonal.

Thanks.
* Re: General output formatting
From: Marc Branchaud @ 2025-02-27 17:14 UTC
To: Junio C Hamano
Cc: brian m. carlson, Jeff King, Yaakov Smith, git@vger.kernel.org

On 2025-02-27 12:06, Junio C Hamano wrote:
> Marc Branchaud <marcnarc@xiplink.com> writes:
>
>>> I wonder if isatty(1) is a good way to say "ah, we are not captured
>>> in 'foo=$(git blah)' and not feeding somebody in 'git blah |
>>> somebody', so we do not have to worry about being machine readable".
>>> If that is a reliable way to tell that we could butcher our output
>>> for the sake of keeping the terminal state sane, we then can always
>>> do the C-quote escaping, or even information losing '?' redaction.
>>
>> Modern practice seems to be moving towards explicit format options to
>> let code that's parsing output directly specify how it wants to see
>> the data. Such options eliminate the need for isatty() heuristics and
>> other guesswork.
>
> I am not opposed to an explicit "please avoid raw binary output" or
> even "please make it even more machine-processable by formatting in
> yaml" options. What I was hinting at was what the default should be
> for interactive use when the output goes directly to the eyes of
> end-users, which is pretty much orthogonal.

Sorry, I read "if that is a reliable way to tell" as looking for
reliability.

I have no opinion on exactly how to "butcher" the output for a
terminal. I guess it depends on how well Git wants to support
copy/paste of its output.

		M.
* Re: General output formatting
From: Junio C Hamano @ 2025-02-27 18:40 UTC
To: Marc Branchaud
Cc: brian m. carlson, Jeff King, Yaakov Smith, git@vger.kernel.org

Marc Branchaud <marcnarc@xiplink.com> writes:

> I have no opinion on exactly how to "butcher" the output for a
> terminal. I guess it depends on how well Git wants to support
> copy/paste of its output.

Yup, that is a fine balancing act. Given the current behaviour,

    $ cat .git/config
    [core]
    somevalue = "true\b\b\b\bfalse"
    $ git config --local --list
    core.somevalue=false

supporting copy-paste may not be such a good thing to do, though ;-)
* Re: \b character escapes in CLI usage
From: Phillip Wood @ 2025-02-27 16:26 UTC
To: brian m. carlson, Jeff King, Yaakov Smith, git@vger.kernel.org

On 26/02/2025 23:36, brian m. carlson wrote:
> On 2025-02-26 at 07:38:22, Jeff King wrote:
>> On Tue, Feb 25, 2025 at 11:44:33PM +0000, Yaakov Smith wrote:
>>>
>>> Should "git config" be smarter here and print something other than a
>>> literal backspace to the terminal, like "git fetch" does?
>>
>> So I would say no here, in general.
>
> I agree this is the right choice in general. I wonder if we might want
> some sort of human-readable output option that might escape these that
> users could use. The output might still be machine-readable, but it
> might be easier to parse than the current format, which has some tricky
> edge cases when a config value contains newlines.

We have '-z' to avoid that ambiguity. I agree that having an option to
provide a human-readable output would be a nice addition.

Best Wishes

Phillip

> We already have precedent for this in core.quotePath and could easily
> use similar logic here. That format, while using octal, which I find
> ugly and hard to read, does have the pleasant side effect that it works
> correctly with POSIX printf(1) (which I'm sure was intentional), unlike
> hex escapes.