All of lore.kernel.org
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] utf8.c: fix strbuf_utf8_replace copying the last NUL to dst string
Date: Tue, 29 Jul 2014 12:56:24 -0700	[thread overview]
Message-ID: <xmqqha202b2v.fsf@gitster.dls.corp.google.com> (raw)
In-Reply-To: <1406639429-28748-1-git-send-email-pclouds@gmail.com> ("Nguyễn	Thái Ngọc Duy"'s message of "Tue, 29 Jul 2014 20:10:29 +0700")

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> When utf8_width(&src) is called with *src == NULL (because the
> source string ends with an ansi sequence),

I am not sure what you mean by "because" here.  Do you mean somebody
(who?) decides to call the utf8_width() with NULL pointer stored in
*src because of "ansi sequence"?

What do you mean by "ansi sequence"?  I'll assume that you mean
those terminal control that all use bytes with hi-bit clear.

At the very beginning of utf8_width(), *start can be cleared to
point at a NULL pointer by pick_one_utf8_char() if the pointer that
comes into utf8_width() originally points at an invalid UTF-8
string, but as far as I can see, ESC (or any bytes that would be
used in those terminal control sequences like colors and cursor
control) will simply be returned as a single byte, without going
into error path that clears *start = NULL.

Puzzled...

> it returns 0 and steps
> 'src' by one. 

Here "it" refers to utf8_width()?  Who steps 'src' by one?

Ahh, did you mean *src == NUL, i.e. "already at the end of the
string"?

I think utf8_width() called with an empty string should not move the
pointer past that end-of-string NUL in the first place.  It makes me
wonder if it would be a better fix to make it not to do that (and
return 0), but if we declare it is the caller's fault, perhaps we
may want to add

	if (!**start)
        	die("BUG: empty string to utf8_width()???");

at the very beginning of utf8_width(), even before it calls
pick-one-utf8-char.

Still puzzled...

> This stepping makes strbuf_utf8_replace add NUL to the
> destination string at the end of the loop. Check and break the loop
> early.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
>  utf8.c | 2 ++
>  1 file changed, 2 insertions(+)

Tests?

> diff --git a/utf8.c b/utf8.c
> index b30790d..cd090a1 100644
> --- a/utf8.c
> +++ b/utf8.c
> @@ -381,6 +381,8 @@ void strbuf_utf8_replace(struct strbuf *sb_src, int pos, int width,
>  			src += n;
>  			dst += n;
>  		}
> +		if (src >= end)
> +			break;
>  
>  		old = src;
>  		n = utf8_width((const char**)&src, NULL);

  reply	other threads:[~2014-07-29 19:56 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-29 13:10 [PATCH] utf8.c: fix strbuf_utf8_replace copying the last NUL to dst string Nguyễn Thái Ngọc Duy
2014-07-29 19:56 ` Junio C Hamano [this message]
2014-07-30 10:20   ` Duy Nguyen
2014-07-30 18:23     ` Junio C Hamano
2014-08-10  7:05 ` [PATCH v2] utf8.c: fix strbuf_utf8_replace() consuming data beyond input string Nguyễn Thái Ngọc Duy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqha202b2v.fsf@gitster.dls.corp.google.com \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=pclouds@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.