From: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>,
"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: [PATCH 08/12] pretty: two phase conversion for non utf-8 commits
Date: Sat, 16 Mar 2013 09:24:39 +0700
Message-ID: <1363400683-14813-9-git-send-email-pclouds@gmail.com>
In-Reply-To: <1363400683-14813-1-git-send-email-pclouds@gmail.com>
Always assume that format_commit_item() takes a UTF-8 string, to keep
string handling simple (we can handle UTF-8 strings, but not strings in
other encodings).
If the commit message is not in UTF-8, or the output encoding is not
UTF-8, the commit is first converted to UTF-8, processed, and then the
result is converted to the output encoding. This of course only works
with encodings that are compatible with Unicode.
This also fixes the iso8859-1 test in t6006. It is supposed to create
an iso8859-1 commit, but the commit content in t6006 was in UTF-8.
The message text in t6006 is therefore converted to iso8859-1 (the
downside is that we can no longer put UTF-8 strings in that file).
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
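A note for readers: the two-phase conversion described in the commit
message can be sketched outside of git as follows. This is only an
illustration under stated assumptions, not the code in this patch. It
uses POSIX iconv(3) directly instead of git's reencode_string(), and
convert(), utf8_truncate() and two_phase_truncate() are hypothetical
names. Truncating to a number of code points stands in for the
placeholder processing that wants UTF-8 input.

```c
#include <assert.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not git's reencode_string()): convert `len` bytes
 * from `from_enc` to `to_enc`. Returns a malloc'd, NUL-terminated buffer
 * and stores its length in *outlen, or returns NULL on any failure.
 */
static char *convert(const char *in, size_t len, const char *to_enc,
		     const char *from_enc, size_t *outlen)
{
	size_t cap = len * 4 + 16; /* simple bound; a real version would grow on E2BIG */
	size_t inleft = len, outleft = cap;
	char *inp = (char *)in; /* iconv() takes char ** but does not modify the input */
	char *out, *outp;
	iconv_t cd;

	assert(in && to_enc && from_enc && outlen);
	cd = iconv_open(to_enc, from_enc);
	if (cd == (iconv_t)-1)
		return NULL;
	out = outp = malloc(cap + 1);
	if (!out ||
	    iconv(cd, &inp, &inleft, &outp, &outleft) == (size_t)-1 ||
	    iconv(cd, NULL, NULL, &outp, &outleft) == (size_t)-1) {
		free(out);
		iconv_close(cd);
		return NULL;
	}
	iconv_close(cd);
	*outp = '\0';
	*outlen = (size_t)(outp - out);
	return out;
}

/* Byte length of the first `max_chars` code points of a UTF-8 buffer. */
static size_t utf8_truncate(const char *s, size_t len, size_t max_chars)
{
	size_t i = 0, chars = 0;

	while (i < len) {
		/* any byte that is not a continuation byte starts a code point */
		if (((unsigned char)s[i] & 0xC0) != 0x80) {
			if (chars == max_chars)
				break;
			chars++;
		}
		i++;
	}
	return i;
}

/* Phase 1: enc -> UTF-8. Process in UTF-8. Phase 2: UTF-8 -> enc. */
static char *two_phase_truncate(const char *msg, size_t len, const char *enc,
				size_t max_chars, size_t *outlen)
{
	size_t u8len;
	char *u8 = convert(msg, len, "UTF-8", enc, &u8len);
	char *out;

	if (!u8)
		return NULL;
	u8len = utf8_truncate(u8, u8len, max_chars);
	out = convert(u8, u8len, enc, "UTF-8", outlen);
	free(u8);
	return out;
}
```

The processing step only ever sees UTF-8, so it needs no knowledge of
the commit's own encoding. A character that the target encoding cannot
represent makes the second conversion fail, which is the "compatible
with Unicode" caveat above.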
pretty.c | 24 ++++++++++++++++++++++--
t/t6006-rev-list-format.sh | 12 ++++++------
2 files changed, 28 insertions(+), 8 deletions(-)
diff --git a/pretty.c b/pretty.c
index 092dd1d..3f4809a 100644
--- a/pretty.c
+++ b/pretty.c
@@ -1003,7 +1003,8 @@ static int format_reflog_person(struct strbuf *sb,
return format_person_part(sb, part, ident, strlen(ident), dmode);
}
-static size_t format_commit_one(struct strbuf *sb, const char *placeholder,
+static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
+ const char *placeholder,
void *context)
{
struct format_commit_context *c = context;
@@ -1235,7 +1236,8 @@ static size_t format_commit_one(struct strbuf *sb, const char *placeholder,
return 0; /* unknown placeholder */
}
-static size_t format_commit_item(struct strbuf *sb, const char *placeholder,
+static size_t format_commit_item(struct strbuf *sb, /* in UTF-8 */
+ const char *placeholder,
void *context)
{
int consumed;
@@ -1315,6 +1317,7 @@ void format_commit_message(const struct commit *commit,
{
struct format_commit_context context;
const char *output_enc = pretty_ctx->output_encoding;
+ const char *utf8 = "UTF-8";
memset(&context, 0, sizeof(context));
context.commit = commit;
@@ -1327,6 +1330,23 @@ void format_commit_message(const struct commit *commit,
strbuf_expand(sb, format, format_commit_item, &context);
rewrap_message_tail(sb, &context, 0, 0, 0);
+ if (output_enc) {
+ if (same_encoding(utf8, output_enc))
+ output_enc = NULL;
+ } else {
+ if (context.commit_encoding &&
+ !same_encoding(context.commit_encoding, utf8))
+ output_enc = context.commit_encoding;
+ }
+
+ if (output_enc) {
+ int outsz;
+ char *out = reencode_string(sb->buf, sb->len,
+ output_enc, utf8, &outsz);
+ if (out)
+ strbuf_attach(sb, out, outsz, outsz + 1);
+ }
+
free(context.commit_encoding);
logmsg_free(context.message, commit);
free(context.signature.gpg_output);
diff --git a/t/t6006-rev-list-format.sh b/t/t6006-rev-list-format.sh
index 3fc3b74..0393c9f 100755
--- a/t/t6006-rev-list-format.sh
+++ b/t/t6006-rev-list-format.sh
@@ -184,7 +184,7 @@ Test printing of complex bodies
This commit message is much longer than the others,
and it will be encoded in iso8859-1. We should therefore
-include an iso8859 character: ¡bueno!
+include an iso8859 character: ¡bueno!
EOF
test_expect_success 'setup complex body' '
git config i18n.commitencoding iso8859-1 &&
@@ -192,14 +192,14 @@ git config i18n.commitencoding iso8859-1 &&
'
test_format complex-encoding %e <<'EOF'
-commit f58db70b055c5718631e5c61528b28b12090cdea
+commit 1ed88da4a5b5ed8c449114ac131efc62178734c3
iso8859-1
commit 131a310eb913d107dd3c09a65d1651175898735d
commit 86c75cfd708a0e5868dc876ed5b8bb66c80b4873
EOF
test_format complex-subject %s <<'EOF'
-commit f58db70b055c5718631e5c61528b28b12090cdea
+commit 1ed88da4a5b5ed8c449114ac131efc62178734c3
Test printing of complex bodies
commit 131a310eb913d107dd3c09a65d1651175898735d
changed foo
@@ -208,17 +208,17 @@ added foo
EOF
test_format complex-body %b <<'EOF'
-commit f58db70b055c5718631e5c61528b28b12090cdea
+commit 1ed88da4a5b5ed8c449114ac131efc62178734c3
This commit message is much longer than the others,
and it will be encoded in iso8859-1. We should therefore
-include an iso8859 character: ¡bueno!
+include an iso8859 character: ¡bueno!
commit 131a310eb913d107dd3c09a65d1651175898735d
commit 86c75cfd708a0e5868dc876ed5b8bb66c80b4873
EOF
test_expect_success '%x00 shows NUL' '
- echo >expect commit f58db70b055c5718631e5c61528b28b12090cdea &&
+ echo >expect commit 1ed88da4a5b5ed8c449114ac131efc62178734c3 &&
echo >>expect fooQbar &&
git rev-list -1 --format=foo%x00bar HEAD >actual.nul &&
nul_to_q <actual.nul >actual &&
--
1.8.2.83.gc99314b
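
The encoding selection that this patch adds to format_commit_message()
can also be read as a small decision function. The sketch below mirrors
that logic in isolation. same_enc() is a hypothetical stand-in for
git's same_encoding(), assumed here to compare names case-insensitively
and to treat "utf8" and "UTF-8" as equal. final_encoding() is likewise
an illustrative name, not something the patch defines.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Assumption: both spellings name the same encoding. */
static int is_utf8_name(const char *name)
{
	return !strcasecmp(name, "utf-8") || !strcasecmp(name, "utf8");
}

/* Hypothetical stand-in for git's same_encoding(). */
static int same_enc(const char *a, const char *b)
{
	assert(a && b);
	if (is_utf8_name(a) && is_utf8_name(b))
		return 1;
	return !strcasecmp(a, b);
}

/*
 * Given the requested output encoding (may be NULL) and the commit's
 * recorded encoding (may be NULL), return the encoding that the UTF-8
 * result must be converted to, or NULL when it can be emitted as is.
 */
static const char *final_encoding(const char *output_enc,
				  const char *commit_enc)
{
	if (output_enc)
		return same_enc("UTF-8", output_enc) ? NULL : output_enc;
	if (commit_enc && !same_enc(commit_enc, "UTF-8"))
		return commit_enc;
	return NULL;
}
```

In words: an explicit output encoding wins, and when none is given the
result falls back to the commit's own encoding. A conversion is skipped
only when the chosen encoding is already UTF-8.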