From: "René Scharfe" <rene.scharfe@lsrfire.ath.cx>
To: Junio C Hamano <gitster@pobox.com>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: [PATCH] log --format: document %w
Date: Mon, 23 Nov 2009 00:46:34 +0100 [thread overview]
Message-ID: <4B09CD5A.4070401@lsrfire.ath.cx> (raw)
In-Reply-To: <7vzl6eiiyx.fsf@alter.siamese.dyndns.org>
Junio C Hamano schrieb:
> René Scharfe <rene.scharfe@lsrfire.ath.cx> writes:
>
>> I'm not especially proud of the triple negative in that note. How to say it
>> better, yet concise?
>> +- '%w([<w>[,<i1>[,<i2>]]])': switch line wrapping, like the -w option of
>> + linkgit:git-shortlog[1]. NOTE: Color placeholders (`%C*`) are not
>> + recognized as having no width, so they should not be put into wrapped
>> + sections.
>
> "The code miscounts the width of '%C*' color placeholders"?
>
> Perhaps somebody in the codepath leading to pick_one_utf8_char() in utf8.c
> can be made aware of them?
>
> utf8_width() is called from many places (has one caller outside utf8.c as
> well). It is given a pointer to a pointer that points at the current
> position in a string, and is responsible for picking up one logical letter
> advancing the given pointer to skip over that letter, and returning the
> display width of that one letter. The function wants the string to be
> encoded in utf-8 and signals by putting NULL in the pointer when it
> detects the input string is not.
>
> Picking up one logical letter is done by pick_one_utf8_char(), which is a
> nicely written generic "We are at the character boundary of a potentially
> multi-byte utf-8 string; pick the first character" implementation, and we
> wouldn't want to contaminate that with escape sequence logic---we might
> want to reuse it in other codepaths where we have no reason to expect any
> escape sequences.
>
> So perhaps we can introduce is_esc_sequence(s, r, w) that
>
> - returns true if we are at the beginning of an esc-sequence;
> - skips the sequence just like utf8_width() does with s and r; and
> - counts the width of the sequence and returns it in *w
>
> The implementation of the is_esc_sequence() could be to only detect the
> color sequence (if the sequence has things like cursor-position control
> then we are already lost, as calling "utf8_width()" on such a string does
> not make much sense anyway) and report zero-width.
>
> I dunno.
>
> diff --git a/utf8.c b/utf8.c
> index 5c18f0c..d45e75f 100644
> --- a/utf8.c
> +++ b/utf8.c
> @@ -241,7 +241,12 @@ invalid:
> */
> int utf8_width(const char **start, size_t *remainder_p)
> {
> - ucs_char_t ch = pick_one_utf8_char(start, remainder_p);
> + ucs_char_t ch;
> + int w;
> +
> + if (is_esc_sequence(start, remainder_p, &w))
> + return w;
> + ch = pick_one_utf8_char(start, remainder_p);
> if (!*start)
> return 0;
> return git_wcwidth(ch);
I think utf8_width() is too generic for that; we shouldn't teach it terminal
control details. Something like this? It keeps it all local to
strbuf_add_wrapped_text(); ignoring display mode escape codes in there can be
justified with its purpose.
utf8.c | 23 ++++++++++++++++++++++-
1 files changed, 22 insertions(+), 1 deletions(-)
diff --git a/utf8.c b/utf8.c
index 5c18f0c..fcc0aeb 100644
--- a/utf8.c
+++ b/utf8.c
@@ -298,6 +298,21 @@ static void print_spaces(struct strbuf *buf, int count)
strbuf_write(buf, s, count);
}
+/* XXX: this handles display mode sequences, only. Do we need more? */
+static size_t esc_sequence_len(const char *s)
+{
+ const char *p = s;
+ if (*p++ != '\033')
+ return 0;
+ if (*p++ != '[')
+ return 0;
+ while (isdigit(*p) || *p == ';')
+ p++;
+ if (*p++ != 'm')
+ return 0;
+ return p - s;
+}
+
/*
* Wrap the text, if necessary. The variable indent is the indent for the
* first line, indent2 is the indent for all other lines.
@@ -329,7 +344,13 @@ int strbuf_add_wrapped_text(struct strbuf *buf,
}
for (;;) {
- char c = *text;
+ char c;
+ size_t skip;
+
+ while ((skip = esc_sequence_len(text)))
+ text += skip;
+
+ c = *text;
if (!c || isspace(c)) {
if (w < width || !space) {
const char *start = bol;
next prev parent reply other threads:[~2009-11-22 23:46 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-11-22 16:15 [PATCH] log --format: document %w René Scharfe
2009-11-22 17:10 ` Junio C Hamano
2009-11-22 23:46 ` René Scharfe [this message]
2009-11-23 0:06 ` Junio C Hamano
2009-11-23 22:40 ` [PATCH] strbuf_add_wrapped_text(): skip over colour codes René Scharfe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4B09CD5A.4070401@lsrfire.ath.cx \
--to=rene.scharfe@lsrfire.ath.cx \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox