* [PATCH] branch -v: align even when the first column is in UTF-8
@ 2012-08-24 14:17 Nguyễn Thái Ngọc Duy
2012-08-24 17:25 ` Junio C Hamano
2012-08-25 10:48 ` [PATCH] branch -v: align even when the first column is " Erik Faye-Lund
0 siblings, 2 replies; 7+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-08-24 14:17 UTC
To: git; +Cc: Nguyễn Thái Ngọc Duy
Branch names are usually in ASCII so they are not the problem. The
problem most likely comes from "(no branch)" translation, which is in
UTF-8 and makes length calculation just wrong.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
So far all git translations are utf-8 compatible. Branch names may
use filesystem encoding, but then packed-refs specifies no encoding.
Anyway branch names should be in utf-8.. at least internally, imo.
builtin/branch.c | 8 +++++---
1 tập tin đã bị thay đổi, 5 được thêm vào(+), 3 bị xóa(-)
diff --git a/builtin/branch.c b/builtin/branch.c
index 0e060f2..7c1ffa8 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -17,6 +17,7 @@
#include "revision.h"
#include "string-list.h"
#include "column.h"
+#include "utf8.h"
static const char * const builtin_branch_usage[] = {
"git branch [options] [-r | -a] [--merged | --no-merged]",
@@ -490,11 +491,12 @@ static void print_ref_item(struct ref_item *item, int maxwidth, int verbose,
}
strbuf_addf(&name, "%s%s", prefix, item->name);
- if (verbose)
+ if (verbose) {
+ int utf8_compensation = strlen(name.buf) - utf8_strwidth(name.buf);
strbuf_addf(&out, "%c %s%-*s%s", c, branch_get_color(color),
- maxwidth, name.buf,
+ maxwidth + utf8_compensation, name.buf,
branch_get_color(BRANCH_COLOR_RESET));
- else
+ } else
strbuf_addf(&out, "%c %s%s%s", c, branch_get_color(color),
name.buf, branch_get_color(BRANCH_COLOR_RESET));
--
1.7.12.rc2.18.g61b472e
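The compensation trick in the hunk above can be tried outside git with a minimal sketch. Note that `approx_width()` and `format_padded()` are hypothetical helpers written for this illustration, not git functions: `approx_width()` is only a rough stand-in for `utf8_strwidth()` that assumes valid UTF-8 and one column per code point, so it ignores wide and combining characters.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for git's utf8_strwidth(): count code points by
 * skipping UTF-8 continuation bytes (10xxxxxx). Assumes valid UTF-8 and
 * one display column per code point.
 */
static int approx_width(const char *s)
{
	int cols = 0;

	for (; *s; s++)
		if (((unsigned char)*s & 0xC0) != 0x80)
			cols++;
	return cols;
}

/*
 * Left-justify name in a field of `cols` display columns, followed by '|'.
 * printf's field width counts bytes, so the field is widened by the number
 * of bytes that do not occupy a column of their own.
 */
static int format_padded(char *out, size_t size, const char *name, int cols)
{
	int compensation = (int)strlen(name) - approx_width(name);

	return snprintf(out, size, "%-*s|", cols + compensation, name);
}
```

Calling `format_padded(buf, sizeof(buf), "master", 10)` and `format_padded(buf, sizeof(buf), "nh\xc3\xa1nh", 10)` should put the `|` in the same terminal column, even though the second buffer is one byte longer.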
* Re: [PATCH] branch -v: align even when the first column is in UTF-8
From: Junio C Hamano @ 2012-08-24 17:25 UTC
To: Nguyễn Thái Ngọc Duy; +Cc: git
Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes:
> Branch names are usually in ASCII so they are not the problem. The
> problem most likely comes from "(no branch)" translation, which is in
> UTF-8 and makes length calculation just wrong.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
> So far all git translations are utf-8 compatible. Branch names may
> use filesystem encoding, but then packed-refs specifies no encoding.
> Anyway branch names should be in utf-8.. at least internally, imo.
I agree with all of the above, but shouldn't you be computing the
"maxwidth" based on the strwidth in the first place? The use of
maxwidth in strbuf_addf() here clearly wants "we know N columns is
sufficient to show all output items, so pad the string to N columns"
here. Looking for assignment "item.len = xxx" in the same file
shows these are computed as byte length, so you are offsetting off
of an incorrectly computed value.
Giving fewer padding bytes when showing a string that will occupy
fewer columns than it has bytes is independently necessary, once we
have the correct maxwidth that is computed in terms of the strwidth,
so this patch is not wrong per-se, but it is incomplete without a
correct maxwidth, no?
> builtin/branch.c | 8 +++++---
> 1 tập tin đã bị thay đổi, 5 được thêm vào(+), 3 bị xóa(-)
>
> diff --git a/builtin/branch.c b/builtin/branch.c
> index 0e060f2..7c1ffa8 100644
> --- a/builtin/branch.c
> +++ b/builtin/branch.c
> @@ -17,6 +17,7 @@
> #include "revision.h"
> #include "string-list.h"
> #include "column.h"
> +#include "utf8.h"
>
> static const char * const builtin_branch_usage[] = {
> "git branch [options] [-r | -a] [--merged | --no-merged]",
> @@ -490,11 +491,12 @@ static void print_ref_item(struct ref_item *item, int maxwidth, int verbose,
> }
>
> strbuf_addf(&name, "%s%s", prefix, item->name);
> - if (verbose)
> + if (verbose) {
> + int utf8_compensation = strlen(name.buf) - utf8_strwidth(name.buf);
> strbuf_addf(&out, "%c %s%-*s%s", c, branch_get_color(color),
> - maxwidth, name.buf,
> + maxwidth + utf8_compensation, name.buf,
> branch_get_color(BRANCH_COLOR_RESET));
> - else
> + } else
> strbuf_addf(&out, "%c %s%s%s", c, branch_get_color(color),
> name.buf, branch_get_color(BRANCH_COLOR_RESET));
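The point of the review above is that the padding target itself has to be measured in columns, not only the per-item adjustment. A minimal sketch of the difference follows; `approx_width()`, `max_columns()` and `max_bytes()` are hypothetical names for this illustration, and `approx_width()` is a simplified substitute for `utf8_strwidth()` that assumes valid UTF-8 with one column per code point.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for utf8_strwidth(): one column per code point. */
static int approx_width(const char *s)
{
	int cols = 0;

	for (; *s; s++)
		if (((unsigned char)*s & 0xC0) != 0x80)
			cols++;
	return cols;
}

/* Widest entry measured in display columns: the value padding should use. */
static int max_columns(const char **names, size_t n)
{
	int w = 0;

	for (size_t i = 0; i < n; i++) {
		int cols = approx_width(names[i]);
		if (cols > w)
			w = cols;
	}
	return w;
}

/* The same scan measured in bytes, as the first patch still did. */
static int max_bytes(const char **names, size_t n)
{
	int w = 0;

	for (size_t i = 0; i < n; i++) {
		int len = (int)strlen(names[i]);
		if (len > w)
			w = len;
	}
	return w;
}
```

For a list such as `"master"` and `"d\xc3\xa9veloppement"`, the byte-based maximum is one larger than the column-based one, so every row would be padded one column further than needed.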
* Re: [PATCH] branch -v: align even when the first column is in UTF-8
From: Erik Faye-Lund @ 2012-08-25 10:48 UTC
To: Nguyễn Thái Ngọc Duy; +Cc: git
On Fri, Aug 24, 2012 at 4:17 PM, Nguyễn Thái Ngọc Duy <pclouds@gmail.com> wrote:
> 1 tập tin đã bị thay đổi, 5 được thêm vào(+), 3 bị xóa(-)
Huh?
* Re: [PATCH] branch -v: align even when the first column is in UTF-8
From: Nguyen Thai Ngoc Duy @ 2012-08-25 11:19 UTC
To: kusmabite; +Cc: git
2012/8/25 Erik Faye-Lund <kusmabite@gmail.com>:
> On Fri, Aug 24, 2012 at 4:17 PM, Nguyễn Thái Ngọc Duy <pclouds@gmail.com> wrote:
>> 1 tập tin đã bị thay đổi, 5 được thêm vào(+), 3 bị xóa(-)
>
> Huh?
Oh, that's something we should fix soon, too. Some time ago I suggested
a "project language" config key that commands like format-patch would
follow. It's hard because only part of the format-patch output would be
in this language, while error messages, for example, would still be in
$LANG. I've been thinking about it, but I haven't found anything worth
mentioning yet.
--
Duy
* [PATCH v2] branch -v: align even when branch names are in UTF-8
From: Nguyễn Thái Ngọc Duy @ 2012-08-25 18:17 UTC
To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy
Branch names are usually in ASCII so they are not the problem. The
problem most likely comes from "(no branch)" translation, which is in
UTF-8 and makes length calculation just wrong.
Update the documentation to mention that we may want ref names in
UTF-8. Encodings that produce invalid UTF-8 are safe, as utf8_strwidth()
falls back to strlen(). Those that incidentally produce valid UTF-8
sequences will cause misalignment.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
On Sat, Aug 25, 2012 at 12:25 AM, Junio C Hamano <gitster@pobox.com> wrote:
> I agree with all of the above, but shouldn't you be computing the
> "maxwidth" based on the strwidth in the first place? The use of
> maxwidth in strbuf_addf() here clearly wants "we know N columns is
> sufficient to show all output items, so pad the string to N columns"
> here. Looking for assignment "item.len = xxx" in the same file
> shows these are computed as byte length, so you are offsetting off
> of an incorrectly computed value.
>
> Giving fewer padding bytes when showing a string that will occupy
> fewer columns than it has bytes is independently necessary, once we
> have the correct maxwidth that is computed in terms of the strwidth,
> so this patch is not wrong per-se, but it is incomplete without a
> correct maxwidth, no?
Yes. This fixes that and also mentions ref names in UTF-8.
Documentation/revisions.txt | 2 ++
builtin/branch.c | 12 +++++++-----
2 files changed, 9 insertions(+), 5 deletions(-)
diff --git a/Documentation/revisions.txt b/Documentation/revisions.txt
index dc0070b..175d397 100644
--- a/Documentation/revisions.txt
+++ b/Documentation/revisions.txt
@@ -55,6 +55,8 @@ when you run `git cherry-pick`.
+
Note that any of the 'refs/*' cases above may come either from
the '$GIT_DIR/refs' directory or from the '$GIT_DIR/packed-refs' file.
+While the ref name encoding is unspecified, UTF-8 is preferred as
+some output processing may assume ref names in UTF-8.
'<refname>@\{<date>\}', e.g. 'master@\{yesterday\}', 'HEAD@\{5 minutes ago\}'::
A ref followed by the suffix '@' with a date specification
diff --git a/builtin/branch.c b/builtin/branch.c
index 0e060f2..73ff7e7 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -17,6 +17,7 @@
#include "revision.h"
#include "string-list.h"
#include "column.h"
+#include "utf8.h"
static const char * const builtin_branch_usage[] = {
"git branch [options] [-r | -a] [--merged | --no-merged]",
@@ -354,7 +355,7 @@ static int append_ref(const char *refname, const unsigned char *sha1, int flags,
newitem->name = xstrdup(refname);
newitem->kind = kind;
newitem->commit = commit;
- newitem->len = strlen(refname);
+ newitem->len = utf8_strwidth(refname);
newitem->dest = resolve_symref(orig_refname, prefix);
/* adjust for "remotes/" */
if (newitem->kind == REF_REMOTE_BRANCH &&
@@ -490,11 +491,12 @@ static void print_ref_item(struct ref_item *item, int maxwidth, int verbose,
}
strbuf_addf(&name, "%s%s", prefix, item->name);
- if (verbose)
+ if (verbose) {
+ int utf8_compensation = strlen(name.buf) - utf8_strwidth(name.buf);
strbuf_addf(&out, "%c %s%-*s%s", c, branch_get_color(color),
- maxwidth, name.buf,
+ maxwidth + utf8_compensation, name.buf,
branch_get_color(BRANCH_COLOR_RESET));
- else
+ } else
strbuf_addf(&out, "%c %s%s%s", c, branch_get_color(color),
name.buf, branch_get_color(BRANCH_COLOR_RESET));
@@ -533,7 +535,7 @@ static void show_detached(struct ref_list *ref_list)
if (head_commit && is_descendant_of(head_commit, ref_list->with_commit)) {
struct ref_item item;
item.name = xstrdup(_("(no branch)"));
- item.len = strlen(item.name);
+ item.len = utf8_strwidth(item.name);
item.kind = REF_LOCAL_BRANCH;
item.dest = NULL;
item.commit = head_commit;
--
1.7.12.rc2.18.g61b472e
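The commit message's remark about encodings can be illustrated with a minimal sketch of a width function that falls back to the byte length when the input is not valid UTF-8. `width_or_bytes()` is a hypothetical helper for this illustration, not git's implementation: it checks only the lead-byte/continuation-byte structure (no overlong or surrogate checks) and counts one column per code point.

```c
#include <string.h>

/*
 * Hypothetical sketch: return the number of code points if s is
 * structurally valid UTF-8, otherwise fall back to strlen(s).
 */
static int width_or_bytes(const char *s)
{
	const unsigned char *p = (const unsigned char *)s;
	int cols = 0;

	while (*p) {
		int extra;

		if (*p < 0x80)
			extra = 0;		/* ASCII */
		else if ((*p & 0xE0) == 0xC0)
			extra = 1;		/* 2-byte sequence */
		else if ((*p & 0xF0) == 0xE0)
			extra = 2;		/* 3-byte sequence */
		else if ((*p & 0xF8) == 0xF0)
			extra = 3;		/* 4-byte sequence */
		else
			return (int)strlen(s);	/* stray or invalid lead byte */
		p++;
		while (extra--) {
			/* A NUL here also fails the check, so we never overrun. */
			if ((*p & 0xC0) != 0x80)
				return (int)strlen(s);
			p++;
		}
		cols++;
	}
	return cols;
}
```

With this sketch, the Latin-1 bytes `"caf\xe9"` are rejected as UTF-8 and measured as 4 bytes, which happens to match their 4 columns. The two bytes `"\xc3\xa9"`, if they were meant as two Latin-1 characters, are valid UTF-8 by coincidence and are counted as a single column, which is the misalignment case the commit message describes.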
* Re: [PATCH v2] branch -v: align even when branch names are in UTF-8
From: Junio C Hamano @ 2012-08-26 18:04 UTC
To: Nguyễn Thái Ngọc Duy; +Cc: git
Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes:
> Branch names are usually in ASCII so they are not the problem. The
> problem most likely comes from "(no branch)" translation, which is in
> UTF-8 and makes length calculation just wrong.
>
> Update the documentation to mention that we may want ref names in
> UTF-8. Encodings that produce invalid UTF-8 are safe, as utf8_strwidth()
> falls back to strlen(). Those that incidentally produce valid UTF-8
> sequences will cause misalignment.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
> ...
> @@ -533,7 +535,7 @@ static void show_detached(struct ref_list *ref_list)
> if (head_commit && is_descendant_of(head_commit, ref_list->with_commit)) {
> struct ref_item item;
> item.name = xstrdup(_("(no branch)"));
> - item.len = strlen(item.name);
> + item.len = utf8_strwidth(item.name);
> item.kind = REF_LOCAL_BRANCH;
> item.dest = NULL;
> item.commit = head_commit;
We should probably rename the "len" field, as it is no longer about
the length (i.e. that which strlen() returns); it is the display
width, and is better called "cols", "width" or somesuch.
I'll squash-in the following.
Thanks.
builtin/branch.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git c/builtin/branch.c w/builtin/branch.c
index 73ff7e7..4ec556f 100644
--- c/builtin/branch.c
+++ w/builtin/branch.c
@@ -250,7 +250,7 @@ static int delete_branches(int argc, const char **argv, int force, int kinds,
struct ref_item {
char *name;
char *dest;
- unsigned int kind, len;
+ unsigned int kind, width;
struct commit *commit;
};
@@ -355,14 +355,14 @@ static int append_ref(const char *refname, const unsigned char *sha1, int flags,
newitem->name = xstrdup(refname);
newitem->kind = kind;
newitem->commit = commit;
- newitem->len = utf8_strwidth(refname);
+ newitem->width = utf8_strwidth(refname);
newitem->dest = resolve_symref(orig_refname, prefix);
/* adjust for "remotes/" */
if (newitem->kind == REF_REMOTE_BRANCH &&
ref_list->kinds != REF_REMOTE_BRANCH)
- newitem->len += 8;
- if (newitem->len > ref_list->maxwidth)
- ref_list->maxwidth = newitem->len;
+ newitem->width += 8;
+ if (newitem->width > ref_list->maxwidth)
+ ref_list->maxwidth = newitem->width;
return 0;
}
@@ -521,8 +521,8 @@ static int calc_maxwidth(struct ref_list *refs)
for (i = 0; i < refs->index; i++) {
if (!matches_merge_filter(refs->list[i].commit))
continue;
- if (refs->list[i].len > w)
- w = refs->list[i].len;
+ if (refs->list[i].width > w)
+ w = refs->list[i].width;
}
return w;
}
@@ -535,12 +535,12 @@ static void show_detached(struct ref_list *ref_list)
if (head_commit && is_descendant_of(head_commit, ref_list->with_commit)) {
struct ref_item item;
item.name = xstrdup(_("(no branch)"));
- item.len = utf8_strwidth(item.name);
+ item.width = utf8_strwidth(item.name);
item.kind = REF_LOCAL_BRANCH;
item.dest = NULL;
item.commit = head_commit;
- if (item.len > ref_list->maxwidth)
- ref_list->maxwidth = item.len;
+ if (item.width > ref_list->maxwidth)
+ ref_list->maxwidth = item.width;
print_ref_item(&item, ref_list->maxwidth, ref_list->verbose, ref_list->abbrev, 1, "");
free(item.name);
}
* Re: [PATCH] branch -v: align even when the first column is in UTF-8
From: Junio C Hamano @ 2012-08-26 18:28 UTC
To: kusmabite; +Cc: Nguyễn Thái Ngọc Duy, git
Erik Faye-Lund <kusmabite@gmail.com> writes:
> On Fri, Aug 24, 2012 at 4:17 PM, Nguyễn Thái Ngọc Duy <pclouds@gmail.com> wrote:
>> 1 tập tin đã bị thay đổi, 5 được thêm vào(+), 3 bị xóa(-)
>
> Huh?
Perhaps format-patch should always use C locale.