git.vger.kernel.org archive mirror
From: "René Scharfe" <rene.scharfe@lsrfire.ath.cx>
To: Sven Strickroth <sven.strickroth@tu-clausthal.de>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>
Subject: Re: git archive --format zip utf-8 issues
Date: Sat, 11 Aug 2012 22:53:37 +0200
Message-ID: <5026C651.8050705@lsrfire.ath.cx>
In-Reply-To: <50259EEC.2020701@tu-clausthal.de>

On 11.08.2012 01:53, Sven Strickroth wrote:
> On 11.08.2012 00:47, Junio C Hamano wrote:
>> Do you know in what encoding the pathnames are _expected_ to be
>> stored in zip archives?
>
> Re-encoding to Latin-1 does not always work and may completely break
> double-byte encodings (e.g. Chinese or Japanese).
>
> PKZIP APPNOTE seems to be the zip standard and it specifies a utf-8
> flag: http://www.pkware.com/documents/casestudies/APPNOTE.TXT
>> A.  Local file header:
>> general purpose bit flag: (2 bytes)
>> Bit 11: Language encoding flag (EFS).  If this bit is
>> set, the filename and comment fields for this file
>> must be encoded using UTF-8. (see APPENDIX D)

Yes, that's one of the two methods for supporting UTF-8 filenames 
described there.
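
As a rough illustration of that flag (a sketch only, not taken from
git's archive-zip code): the helper name general_purpose_flags is made
up, and it assumes the common policy of setting bit 11 only when the
name is not plain ASCII.

```python
UTF8_FLAG = 1 << 11  # APPNOTE general purpose bit 11 (EFS), i.e. 0x0800


def general_purpose_flags(name: str, base_flags: int = 0) -> tuple[bytes, int]:
    """Return (encoded name, flags) for a ZIP file header.

    Hypothetical helper: plain-ASCII names are stored unchanged with the
    flag left clear; anything else is stored as UTF-8 with bit 11 set.
    """
    try:
        return name.encode("ascii"), base_flags
    except UnicodeEncodeError:
        return name.encode("utf-8"), base_flags | UTF8_FLAG
```

The same flag value has to go into both the local file header and the
matching central directory entry.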

The other method involves writing extra ZIP header fields and was 
invented by Info-ZIP. They don't use it consistently anymore, though 
(from zip -h2):

  "Zip now stores UTF-8 in entry path and comment fields on systems
   where UTF-8 char set is default, such as most modern Unix, and
   on other systems in new extra fields with escaped versions in
   entry path and comment fields for backward compatibility."
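
For reference, a minimal sketch of what such an extra field could look
like. It assumes the Info-ZIP Unicode Path layout as I recall it from
APPNOTE (header ID 0x7075, a version byte of 1, the CRC-32 of the name
stored in the regular header field, then the UTF-8 name); the function
name unicode_path_extra is made up for illustration.

```python
import struct
import zlib

UNICODE_PATH_ID = 0x7075  # assumed header ID of the Unicode Path extra field


def unicode_path_extra(header_name: bytes, real_name: str) -> bytes:
    """Build a Unicode Path extra field for one entry.

    header_name is the (possibly escaped) name written to the regular
    file name field; its CRC-32 lets readers detect a stale extra field.
    """
    utf8_name = real_name.encode("utf-8")
    crc = zlib.crc32(header_name) & 0xFFFFFFFF
    # Field data: 1-byte version, 4-byte CRC-32, then the UTF-8 name.
    data = struct.pack("<BI", 1, crc) + utf8_name
    # Extra field framing: 2-byte header ID, 2-byte data size, data.
    return struct.pack("<HH", UNICODE_PATH_ID, len(data)) + data
```

A reader that does not know the field simply skips it and falls back to
the escaped name in the regular header.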

René
