* Re: [PATCH] gpg-interface: fix strip_cr_before_lf to only remove CR before LF
2026-06-23 8:45 [PATCH] gpg-interface: fix strip_cr_before_lf to only remove CR before LF DSAntonio08
@ 2026-06-23 15:09 ` Junio C Hamano
0 siblings, 0 replies; 2+ messages in thread
From: Junio C Hamano @ 2026-06-23 15:09 UTC (permalink / raw)
To: DSAntonio08; +Cc: git
DSAntonio08 <antonio.destefani08@gmail.com> writes:
> The remove_cr_after() function was stripping all CR characters
> unconditionally, even lone \r not followed by \n. This is incorrect
> as only \r\n sequences (Windows line endings) should be normalized.
>
> Fix the loop condition to skip \r only when immediately followed by
> \n, and rename the function to strip_cr_before_lf to reflect its
> actual behavior. Update both call sites and their comments accordingly.
> ---
A few comments.
- Documentation/SubmittingPatches[[sign-off]] wants you to certify
that this patch is something you have the right to submit to this
project. Sign-off is missing at the end of the proposed log
message.
- Documentation/SubmittingPatches[[real-name]] also prefers to see
us interacting with humans with real-sounding names, not handles.
- When changing design decision made in a fairly ancient code, we
would prefer to see the proposed log message says why the code is
that way, and how and why it is safe to change it.
As to the last point, in this particular case, the NEEDSWORK comment
says "only CRs before LFs", but we must not blindly follow NEEDSWORK
comments. They just mean "somebody may want to rethink the issues
in the future but right now we stop here and leave the code in this
shape."
So, let's see if we can continue their thinking.
This "We should normalize CR/LF into LF but we are too lazy to
bother" originally came from c4adea82 (Convert CR/LF to LF in tag
signatures, 2008-07-11), and then 2f47eae2 (Split GPG interface into
its own helper library, 2011-09-07) moved the code around to make
the part interacting with GPG reusable. Later when SSH signing was
introduced in 29b31577 (ssh signing: add ssh key format and signing
code, 2021-09-10), the "remove CR" was made into a helper function,
and that is what remains today.
Having something like the above paragraph in the proposed commit log
message would have been very good. Further, there might be old
discussion that led to c4adea82 where people may have discussed if
it is sensible to be lazy, which you may further want to dig down to
the root, to make sure that even back then people were aware that
touching only CRLF, not all CRs that appear at random places, was
the right thing and the code was done only due to laziness (rather
than, e.g., leaving lone CRs in the message somehow harms other
parts of the system in a way we are not realizing in this
discussion).
That would make a very good supporting material to convince readers
why the change this patch is making a good idea.
> gpg-interface.c | 18 +++++++-----------
> 1 file changed, 7 insertions(+), 11 deletions(-)
>
> diff --git a/gpg-interface.c b/gpg-interface.c
> index dafd5371fa..87ae6503da 100644
> --- a/gpg-interface.c
> +++ b/gpg-interface.c
> @@ -989,17 +989,13 @@ int sign_buffer(struct strbuf *buffer, struct strbuf *signature,
> free(keyid_to_free);
> return ret;
> }
> -
> -/*
> - * Strip CR from the line endings, in case we are on Windows.
> - * NEEDSWORK: make it trim only CRs before LFs and rename
> - */
> -static void remove_cr_after(struct strbuf *buffer, size_t offset)
> +/* Strip CR before LF from the line endings, in case we are on Windows. */
> +static void strip_cr_before_lf(struct strbuf *buffer, size_t offset)
> {
> size_t i, j;
>
> for (i = j = offset; i < buffer->len; i++) {
> - if (buffer->buf[i] != '\r') {
> + if (buffer->buf[i] != '\r' || (i + 1 < buffer->len && buffer->buf[i + 1] != '\n')) {
Please avoid making lines overly long like this. Wrapping at a
logical break in the expression like this:
if (buffer->buf[i] != '\r' ||
(i + 1 < buffer->len && buffer->buf[i + 1] != '\n')) {
would not just make the line fit in a reasonable width limit, but
makes it easier to follow.
> if (i != j)
> buffer->buf[j] = buffer->buf[i];
> j++;
More importantly, isn't the above slightly buggy? If the buffer
ends with a lone CR (i.e. not followed by LF), your condition:
if (buffer->buf[i] != '\r' || (i + 1 < buffer->len && buffer->buf[i + 1] != '\n'))
will evaluate to false because both operands of the OR are false;
the byte we are looking at is CR so the LHS off OR is false, and
because we are at the end of the buffer, so we do not have the byte
that follows ours that is not LF, which makes RHS of OR also false.
The lone trailing CR will be skipped, instead of getting copied.
If I were writing this, I would have written it more like this:
for (i = j = offset; i < buffer->len; i++) {
if (buffer->buf[i] == '\r' &&
i + 1 < buffer->len && buffer->buf[i + 1] == '\n')
continue;
buffer->buf[j++] = buffer->buf[i];
}
This not only avoids the bug by keeping lone trailing CRs, but
also makes what we are special-casing stand out more clearly
(assignment is the norm, skipping is the exception), and avoids
pushing the assignment too deep in the conditional for readability.
The changes to the callers (due to function name update) below
(ellided) both looked fine.
Thanks.
^ permalink raw reply [flat|nested] 2+ messages in thread