From: Junio C Hamano <gitster@pobox.com>
To: Phillip Wood <phillip.wood123@gmail.com>
Cc: Toon Claes <toon@iotcl.com>,
	git@vger.kernel.org, Taylor Blau <me@ttaylorr.com>
Subject: Re: [PATCH v5 1/1] cat-file: quote-format name in error when using -z
Date: Mon, 15 May 2023 10:20:51 -0700
Message-ID: <xmqqmt25a4uk.fsf@gitster.g>
In-Reply-To: <ec139b78-1d36-f894-e39f-f29877a67b18@gmail.com> (Phillip Wood's message of "Mon, 15 May 2023 09:47:38 +0100")

Phillip Wood <phillip.wood123@gmail.com> writes:

> On 12/05/2023 17:57, Junio C Hamano wrote:
>> Toon Claes <toon@iotcl.com> writes:
>> Stepping back a bit, how big a problem is this in real life?  It
>> certainly is possible to create a pathname with funny byte values in
>> it, and in some environments, letters like single-quote that are
>> considered cumbersome to handle by those who are used to CLI
>> programs may be commonplace.  But a path with newline?  Or any
>> control character for that matter?  And this is not even the primary
>> output from the program but is an error message for consumption by
>> humans, no?
>> I am wondering if it is simpler to just declare that the paths
>> output in error messages have certain bytes, probably all control
>> characters other than HT, replaced with a dot, and tell the users
>> not to rely on the pathnames being intact if they contain funny
>> bytes in them.
>
> We could only c-quote the name when it contains a control character
> other than HT. That way names containing double quotes and backslashes
> are unchanged but it will still be possible to parse the path from the
> error message. If we're going to munge the name we might as well use
> our standard quoting rather than some ad-hoc scheme.

In the above suggestion, I gave up and no longer aim to do
"quoting".  A more appropriate word for the approach is "redacting".
The message essentially is: If you use truly problematic bytes in
your path, they are redacted (so do not use them if it hurts).
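
A minimal sketch of the "redacting" idea, for illustration only.  The
function name redact_name() is hypothetical and not taken from git's
source, and it assumes that "control characters" means bytes 0x00-0x1f
plus DEL (0x7f), with HT left alone:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from git.git): copy `name` into `out`,
 * replacing every control byte other than HT with a dot.
 * Returns 0 on success, -1 if `out` is too small.
 */
int redact_name(const char *name, char *out, size_t outlen)
{
	size_t len = strlen(name);
	size_t i;

	if (len + 1 > outlen)
		return -1;
	for (i = 0; i < len; i++) {
		unsigned char c = (unsigned char)name[i];
		/* Assumption: control bytes are 0x00-0x1f and 0x7f (DEL). */
		int is_ctrl = (c < 0x20 || c == 0x7f);

		out[i] = (is_ctrl && c != '\t') ? '.' : (char)c;
	}
	out[len] = '\0';
	return 0;
}
```

With this, a name such as lowercase-A, newline, lowercase-B would be
shown as "a.b", while dq, bs and HT pass through untouched.  The
output is not reversible, which is the point of calling it redaction
rather than quoting.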

This is because I am not sure how "names containing dq and bs are
unchanged" can be done without ambiguity.  If I see a message that
comes out of this:

	printf("%s missing\n", obj_name);

and it looks like

	"a\nb" missing

how do I tell if it is complaining about the object the user named
with a three-byte string (i.e. lowercase-A, newline, lowercase-B),
or a six-byte string (i.e. dq, lowercase-A, bs, lowercase-N,
lowercase-B, dq)?
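
The collision can be reproduced with a small sketch of the proposed
rule as I read it: quote only when the name has a control character
other than HT, and leave dq and bs alone.  The names needs_quoting()
and format_name() are hypothetical, and the escaping is simplified
(only \n and octal escapes); this is not git's actual quoting code:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical: does `name` contain a control byte other than HT? */
int needs_quoting(const char *name)
{
	for (; *name; name++) {
		unsigned char c = (unsigned char)*name;

		if ((c < 0x20 && c != '\t') || c == 0x7f)
			return 1;
	}
	return 0;
}

/*
 * Hypothetical formatter: names without problematic control bytes are
 * copied verbatim; others are wrapped in dq with control bytes escaped,
 * while dq and bs themselves are NOT escaped.
 * Returns 0 on success, -1 if `out` is too small.
 */
int format_name(const char *name, char *out, size_t outlen)
{
	size_t pos = 0;

	if (!needs_quoting(name)) {
		size_t len = strlen(name);

		if (len + 1 > outlen)
			return -1;
		memcpy(out, name, len + 1);
		return 0;
	}
	if (outlen < 3)
		return -1;
	out[pos++] = '"';
	for (; *name; name++) {
		unsigned char c = (unsigned char)*name;

		/* Worst case: 4-byte escape, closing dq, and NUL. */
		if (pos + 6 > outlen)
			return -1;
		if (c == '\n') {
			out[pos++] = '\\';
			out[pos++] = 'n';
		} else if ((c < 0x20 && c != '\t') || c == 0x7f) {
			pos += (size_t)snprintf(out + pos, outlen - pos,
						"\\%03o", (unsigned)c);
		} else {
			out[pos++] = (char)c;
		}
	}
	out[pos++] = '"';
	out[pos] = '\0';
	return 0;
}
```

Feeding it the three-byte name and the six-byte name yields the same
six bytes of output in both cases, so the reader of the message cannot
tell which one was meant.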

If we were forbidding '"' to appear in a refname, then we could take
advantage of the fact that the name of an object inside a tree at a
funny path would not start with '"', to disambiguate.  For the
three- and six-byte string cases above, the formatting function will
give these messages (referred to as "sample output" below):

	"master:a\nb" missing
	master:"a\nb" missing

because of your "we do not exactly do our standard c-quote; we
exempt dq and bs from the bytes to be quoted" rule.

But it still feels a bit misleading.  This codepath may have the
whole objectname as a single string so that c-quoting the entire
"<commit> <colon> <path>" inside a single c-quoted string that
begins with a dq is easy, but not all codepaths are lucky and some
may have to show <commit> and <path> separately, concatenated with
<colon> at the outermost output layer, which means that the second
one from the sample output may still mean the path with three-byte
name in the tree of 'master' commit.

And worse yet, because

	git branch '"master'

is possible (even though nobody sane would do that), the "treat the
string as c-quoted only if the object name as a whole begins with a
dq" disambiguation idea would not work.  The first one from
the sample output could be the blob at the path with a five-byte
string name (i.e. lowercase-A, bs, lowercase-N, lowercase-B, dq)
in the tree of the commit at the tip of branch with seven-byte
string name (i.e. dq followed by 'master').
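
Seen from the reader's side, the two readings of the first sample
message can be sketched as follows.  Both helpers are hypothetical
and simplified: unquote_simple() handles only the \n escape, and
split_literal() just splits at the first colon:

```c
#include <string.h>

/*
 * Hypothetical reader-side helper: treat `msg` as a c-quoted string
 * (dq ... dq, with \n as the only escape handled) and decode it.
 * Returns the decoded length, or -1 if `msg` is not in quoted form
 * or `out` is too small.
 */
int unquote_simple(const char *msg, char *out, size_t outlen)
{
	size_t len = strlen(msg), i, pos = 0;

	if (len < 2 || msg[0] != '"' || msg[len - 1] != '"')
		return -1;
	for (i = 1; i + 1 < len; i++) {
		char c = msg[i];

		if (c == '\\' && i + 2 < len && msg[i + 1] == 'n') {
			c = '\n';
			i++;
		}
		if (pos + 1 >= outlen)
			return -1;
		out[pos++] = c;
	}
	out[pos] = '\0';
	return (int)pos;
}

/*
 * Hypothetical literal reading: take `msg` as unquoted "<rev>:<path>"
 * and split it at the first colon.  Returns 0 on success, -1 on error.
 */
int split_literal(const char *msg, char *rev, size_t revlen,
		  char *path, size_t pathlen)
{
	const char *colon = strchr(msg, ':');
	size_t rlen, plen;

	if (!colon)
		return -1;
	rlen = (size_t)(colon - msg);
	plen = strlen(colon + 1);
	if (rlen + 1 > revlen || plen + 1 > pathlen)
		return -1;
	memcpy(rev, msg, rlen);
	rev[rlen] = '\0';
	memcpy(path, colon + 1, plen + 1);
	return 0;
}
```

Both calls succeed on the same input: the quoted reading decodes to
'master', colon, and the three-byte path, while the literal reading
gives the seven-byte branch name and the five-byte path.  Nothing in
the message itself says which reading is right.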

So, I dunno.
