From: "René Scharfe" <rene.scharfe@lsrfire.ath.cx>
To: Junio C Hamano <gitster@pobox.com>
Cc: Jeff King <peff@peff.net>,
	Sven Strickroth <sven.strickroth@tu-clausthal.de>,
	git@vger.kernel.org
Subject: Re: git archive --format zip utf-8 issues
Date: Fri, 21 Sep 2012 00:00:09 +0200
Message-ID: <505B91E9.7060208@lsrfire.ath.cx>
In-Reply-To: <7v1uhzkpcc.fsf@alter.siamese.dyndns.org>

On 18.09.2012 23:12, Junio C Hamano wrote:
> René Scharfe <rene.scharfe@lsrfire.ath.cx> writes:
>
>>                                           Windows    Info-ZIP unzip
>>                              7-Zip PeaZip builtin Linux msysgit Windows
>> 7-Zip 9.20                      0      0      46    26      43      43
>> PeaZip 4.7.1 win64              0      0      46    26      42      42
>> Info-ZIP zip 3.0 Linux          0      0      72     0      43      43
>> Info-ZIP zip 3.0 Windows       45     45     n/a     0      43      43

> It is kind of surprising that "Windows builtin" has very poor score
> extracting from the output of Zip tools running on Windows (I am
> looking at 46, 46 and n/a over there).  If you tell it to create an
> archive from its disk and then extract from it, I wonder what would
> happen.

I didn't include it as a packer because it refused to archive the
pangrams directory due to illegal characters in one of the filenames.
When I tried a bit harder just now, I had to delete all but 14 files
(those with Latin script, accents etc.) before I could zip the
directory.  I'll include these results in the next round.

It uses codepage 850 on my system (MSDOS Latin 1).  I don't expect this 
to be portable.
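
To illustrate why a codepage-dependent name is not portable, here is a
minimal Python sketch.  The file name is made up for this example and is
not one of the pangram files; the point is only that the same stored
bytes read differently depending on which codepage the extractor
assumes (CP437 is the traditional ZIP default, UTF-8 is what unzip on
Linux would expect).

```python
# Illustrative only: this file name is hypothetical, not from the test set.
name = "Øl.txt"

# Bytes as a packer working in codepage 850 would store them.
raw = name.encode("cp850")

# An extractor assuming CP437 silently ends up with a different name.
as_cp437 = raw.decode("cp437")

# An extractor assuming UTF-8 cannot decode the bytes at all.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(repr(raw), as_cp437, utf8_ok)
```

Plain ASCII names such as img_0123.jpg encode to the same bytes in all
three encodings, which is why they survive the round trip.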

> Does this result mean that practically nobody uses Zip archive with
> exotic letters in paths on that platform?  I am not talking about
> developers and savvy people who know where to download third-party
> Zip archivers and how to install them.  I am imagining a grandma who
> received an archive full of photos of her grandchild in her Outlook
> Express or GMail inbox, clicked the attachment to download it, and
> is trying to view the photo inside.

Not necessarily.  Photos often have names like img_0123.jpg etc., which 
are handled just fine.  And all family members probably use the same 
codepage on their computers, so they're less likely to run into this 
problem.

By the way, I found this bug asking for codepage support in unzip:

   https://bugs.launchpad.net/ubuntu/+source/unzip/+bug/580961

Multiple codepages seem to be used for ZIP files in the wild.  None of
them is supported by unzip on Linux, which only accepts ASCII or UTF-8.
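
As a rough way to check which of the two cases an archive entry falls
into, the following Python sketch uses the standard zipfile module to
look at general purpose bit 11 (0x800), which the ZIP format defines as
"file name is UTF-8".  The helper name name_flags and the sample file
names are made up for this example; it says nothing about how unzip
itself decides.

```python
import io
import zipfile

UTF8_NAME_FLAG = 0x800  # general purpose bit 11 in the ZIP format


def name_flags(filename):
    """Write one empty entry to an in-memory archive, read it back,
    and report the stored name and whether the UTF-8 flag is set."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(filename, b"")
    with zipfile.ZipFile(buf) as zf:
        info = zf.infolist()[0]
        return info.filename, bool(info.flag_bits & UTF8_NAME_FLAG)


for sample in ("img_0123.jpg", "grüße.txt"):
    print(name_flags(sample))
```

Python's zipfile only sets the flag when the name is not plain ASCII, so
the first entry comes back unflagged and the second one flagged.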

René
