From: "René Scharfe" <rene.scharfe@lsrfire.ath.cx>
To: Junio C Hamano <gitster@pobox.com>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: [PATCH] log --format: document %w
Date: Mon, 23 Nov 2009 00:46:34 +0100 [thread overview]
Message-ID: <4B09CD5A.4070401@lsrfire.ath.cx> (raw)
In-Reply-To: <7vzl6eiiyx.fsf@alter.siamese.dyndns.org>
Junio C Hamano schrieb:
> René Scharfe <rene.scharfe@lsrfire.ath.cx> writes:
>
>> I'm not especially proud of the triple negative in that note. How to say it
>> better, yet concise?
>> +- '%w([<w>[,<i1>[,<i2>]]])': switch line wrapping, like the -w option of
>> + linkgit:git-shortlog[1]. NOTE: Color placeholders (`%C*`) are not
>> + recognized as having no width, so they should not be put into wrapped
>> + sections.
>
> "The code miscounts the width of '%C*' color placeholders"?
>
> Perhaps somebody in the codepath leading to pick_one_utf8_char() in utf8.c
> can be made aware of them?
>
> utf8_width() is called from many places (has one caller outside utf8.c as
> well). It is given a pointer to a pointer that points at the current
> position in a string, and is responsible for picking up one logical letter
> advancing the given pointer to skip over that letter, and returning the
> display width of that one letter. The function wants the string to be
> encoded in utf-8 and signals by putting NULL in the pointer when it
> detects the input string is not.
>
> Picking up one logical letter is done by pick_one_utf8_char(), which is a
> nicely written generic "We are at the character boundary of a potentially
> multi-byte utf-8 string; pick the first character" implementation, and we
> wouldn't want to contaminate that with escape sequence logic---we might
> want to reuse it in other codepaths where we have no reason to expect any
> escape sequences.
>
> So perhaps we can introduce is_esc_sequence(s, r, w) that
>
> - returns true if we are at the beginning of an esc-sequence;
> - skips the sequence just like utf8_width() does with s and r; and
> - counts the width of the sequence and returns it in *w
>
> The implementation of the is_esc_sequence() could be to only detect the
> color sequence (if the sequence has things like cursor-position control
> then we are already lost, as calling "utf8_width()" on such a string does
> not make much sense anyway) and report zero-width.
>
> I dunno.
>
> diff --git a/utf8.c b/utf8.c
> index 5c18f0c..d45e75f 100644
> --- a/utf8.c
> +++ b/utf8.c
> @@ -241,7 +241,12 @@ invalid:
> */
> int utf8_width(const char **start, size_t *remainder_p)
> {
> - ucs_char_t ch = pick_one_utf8_char(start, remainder_p);
> + ucs_char_t ch;
> + int w;
> +
> + if (is_esc_sequence(start, remainder_p, &w))
> + return w;
> + ch = pick_one_utf8_char(start, remainder_p);
> if (!*start)
> return 0;
> return git_wcwidth(ch);
I think utf8_width() is too generic for that; we shouldn't teach it terminal
control details. Something like this? It keeps it all local to
strbuf_add_wrapped_text(); ignoring display mode escape codes in there can be
justified with its purpose.
utf8.c | 23 ++++++++++++++++++++++-
1 files changed, 22 insertions(+), 1 deletions(-)
diff --git a/utf8.c b/utf8.c
index 5c18f0c..fcc0aeb 100644
--- a/utf8.c
+++ b/utf8.c
@@ -298,6 +298,21 @@ static void print_spaces(struct strbuf *buf, int count)
strbuf_write(buf, s, count);
}
+/* XXX: this handles display mode sequences, only. Do we need more? */
+static size_t esc_sequence_len(const char *s)
+{
+ const char *p = s;
+ if (*p++ != '\033')
+ return 0;
+ if (*p++ != '[')
+ return 0;
+ while (isdigit(*p) || *p == ';')
+ p++;
+ if (*p++ != 'm')
+ return 0;
+ return p - s;
+}
+
/*
* Wrap the text, if necessary. The variable indent is the indent for the
* first line, indent2 is the indent for all other lines.
@@ -329,7 +344,13 @@ int strbuf_add_wrapped_text(struct strbuf *buf,
}
for (;;) {
- char c = *text;
+ char c;
+ size_t skip;
+
+ while ((skip = esc_sequence_len(text)))
+ text += skip;
+
+ c = *text;
if (!c || isspace(c)) {
if (w < width || !space) {
const char *start = bol;
next prev parent reply other threads:[~2009-11-22 23:46 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-11-22 16:15 [PATCH] log --format: document %w René Scharfe
2009-11-22 17:10 ` Junio C Hamano
2009-11-22 23:46 ` René Scharfe [this message]
2009-11-23 0:06 ` Junio C Hamano
2009-11-23 22:40 ` [PATCH] strbuf_add_wrapped_text(): skip over colour codes René Scharfe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4B09CD5A.4070401@lsrfire.ath.cx \
--to=rene.scharfe@lsrfire.ath.cx \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.