* Support for EBCDIC
@ 2014-07-03 2:39 Scott McKellar
2014-07-03 17:34 ` Jeff King
0 siblings, 1 reply; 3+ messages in thread
From: Scott McKellar @ 2014-07-03 2:39 UTC (permalink / raw)
To: git@vger.kernel.org
Is Git supposed to be usable in an environment where the execution character set is EBCDIC?
I ask because, in browsing the source code (version 2.0.0), I stumbled across three functions
that won't work as presumably intended in an EBCDIC environment (strihash(), memihash(), and
git_user_agent_sanitized()). I can report them as bugs, but if EBCDIC is considered out of
scope, then they aren't bugs.
These three functions can be readily fixed to make them portable across character sets. There may be other spots that are harder to fix.
I have done a lot of grepping and Googling, but I haven't found a clear, authoritative answer
to this question. From searching this mailing list, it appears that nobody is interested in
supporting EBCDIC. However I found one wiki page describing how to run Git on an IBM i, which
is an EBCDIC-based successor to the AS/400 series. See:
http://wsip-174-79-32-155.ph.ph.cox.net/wiki/index.php/PASE/Git
That installation was reportedly running version 1.7.9.4, which I believe predates the
introduction of strihash() and memihash(); I don't know about git_user_agent_sanitized().
Mind you, I'm not advocating for EBCDIC. I escaped from the EBCDIC world about fifteen years
ago, and have no desire to return. I just want to know if character set issues are worth
reporting. The same issues may arise for other, more obscure character sets.
Scott McKellar
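
For illustration, here is a minimal sketch of what a character-set-neutral case-insensitive hash could look like. The thread does not quote the Git code, so this is not the actual implementation: the name sketch_strihash and the SKETCH_* macros are hypothetical, and FNV-1 is simply assumed as the hash. The point is that folding case with toupper() avoids range arithmetic such as (c >= 'a' && c <= 'z'), which also matches non-letters in EBCDIC because a-z is not contiguous there.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Standard FNV-1 32-bit constants; the choice of FNV-1 is an assumption. */
#define SKETCH_FNV32_BASE  0x811c9dc5u
#define SKETCH_FNV32_PRIME 0x01000193u

/*
 * Hypothetical case-insensitive string hash. Case is folded with
 * toupper() rather than by comparing against 'a' and 'z', so the
 * result does not depend on letters being contiguous in the
 * execution character set.
 */
unsigned int sketch_strihash(const char *str)
{
	unsigned int c, hash = SKETCH_FNV32_BASE;

	assert(str != NULL);
	while ((c = (unsigned char)*str++) != 0) {
		c = (unsigned int)toupper((int)c);
		hash = (hash * SKETCH_FNV32_PRIME) ^ c;
	}
	return hash;
}
```

Note that toupper() is locale-dependent; in the "C" locale it folds only the basic Latin letters, which is probably what a hash like this wants.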
* Re: Support for EBCDIC
2014-07-03 2:39 Support for EBCDIC Scott McKellar
@ 2014-07-03 17:34 ` Jeff King
2014-07-03 20:01 ` Jason Pyeron
0 siblings, 1 reply; 3+ messages in thread
From: Jeff King @ 2014-07-03 17:34 UTC (permalink / raw)
To: Scott McKellar; +Cc: git@vger.kernel.org
On Wed, Jul 02, 2014 at 07:39:12PM -0700, Scott McKellar wrote:
> Is Git supposed to be usable in an environment where the execution character set is EBCDIC?
Not really.
In addition to the cases you found (and I would be surprised if there
are not more, such as our reimplementation of ctype.h), we assume:
  - we can intermingle ASCII from string literals with user data to form
    diffs, commit objects, network protocols, etc. This is actually a
    problem not just for EBCDIC, but for any encoding which is not an
    ASCII-superset (like UTF-16).
  - many outputs from git should be ASCII in order to interoperate with
    the outside world (object headers, network protocols, etc).
So I'd be surprised if things worked well in an EBCDIC environment (but
I have never worked with one, so maybe I do not understand all of the
implications).
-Peff
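
As a sketch of how the ASCII assumption described above could be made explicit, a program can check a few character literals against their ASCII values. Nothing in the thread says Git does this; the function name sketch_charset_is_ascii is hypothetical, and the handful of characters checked is only a sample, not a full proof of ASCII compatibility.

```c
#include <assert.h>	/* static_assert (C11) */

/* Fail the build early if basic literals are not ASCII-encoded. */
static_assert('A' == 0x41 && 'a' == 0x61 && '0' == 0x30,
	      "execution character set is not ASCII-compatible");

/*
 * Hypothetical runtime variant: returns 1 if a sample of character
 * literals have their ASCII values and a-z is contiguous, else 0.
 */
int sketch_charset_is_ascii(void)
{
	return 'A' == 0x41 && 'a' == 0x61 && '0' == 0x30 &&
	       ' ' == 0x20 && '\n' == 0x0a &&
	       'z' - 'a' == 25;
}
```

On an EBCDIC compiler the static_assert would stop the build, since 'a' is 0x81 and 'A' is 0xC1 there.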
* RE: Support for EBCDIC
2014-07-03 17:34 ` Jeff King
@ 2014-07-03 20:01 ` Jason Pyeron
0 siblings, 0 replies; 3+ messages in thread
From: Jason Pyeron @ 2014-07-03 20:01 UTC (permalink / raw)
To: git; +Cc: 'Jeff King', 'Scott McKellar'
> -----Original Message-----
> From: Jeff King
> Sent: Thursday, July 03, 2014 13:34
>
> On Wed, Jul 02, 2014 at 07:39:12PM -0700, Scott McKellar wrote:
>
> > Is Git supposed to be usable in an environment where the execution
> > character set is EBCDIC?
>
> Not really.
If the core uses specific 8-bit values for its internals, then there is a hope and
prayer.
E.g. "blob" would need to be char _BLOB[] = {0x62,0x6c,0x6f,0x62}, because the hash
calculation would be wrong if it were {0x82,0x93,0x96,0x82} (the EBCDIC encoding of
the same letters). Spelling out the bytes ensures the compiler does not change that
"binary" data value.
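
A minimal sketch of that idea, assuming the usual loose-object header layout of "blob", a space, the decimal size, and a NUL byte. The names sketch_blob_tag and sketch_blob_header are hypothetical and not taken from the Git source. Every byte is written as a numeric ASCII value, so the result is the same whatever character set the compiler uses for literals.

```c
#include <assert.h>
#include <stddef.h>

/* "blob" as ASCII byte values, independent of the compiler's charset. */
static const unsigned char sketch_blob_tag[4] = { 0x62, 0x6c, 0x6f, 0x62 };

/*
 * Hypothetical helper: writes "blob <size>\0" in ASCII into out.
 * Returns the number of bytes written (including the NUL), or 0 if
 * out is too small.
 */
size_t sketch_blob_header(unsigned char *out, size_t outlen,
			  unsigned long size)
{
	unsigned char digits[32];
	size_t ndigits = 0, n = 0, i;

	assert(out != NULL);
	/* Collect decimal digits, least significant first, as ASCII. */
	do {
		digits[ndigits++] = (unsigned char)(0x30 + size % 10);
		size /= 10;
	} while (size != 0);

	if (outlen < sizeof(sketch_blob_tag) + 1 + ndigits + 1)
		return 0;
	for (i = 0; i < sizeof(sketch_blob_tag); i++)
		out[n++] = sketch_blob_tag[i];
	out[n++] = 0x20;		/* ASCII space */
	while (ndigits > 0)
		out[n++] = digits[--ndigits];
	out[n++] = 0x00;		/* terminating NUL */
	return n;
}
```

The digits are produced as 0x30 + d rather than '0' + d for the same reason: '0' is 0xF0 in EBCDIC.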
>
> In addition to the cases you found (and I would be surprised if there
> are not more, such as our reimplementation of ctype.h), we assume:
>
> - we can intermingle ASCII from string literals with user data to form
>   diffs, commit objects, network protocols, etc. This is actually a
>   problem not just for EBCDIC, but for any encoding which is not an
>   ASCII-superset (like UTF-16).
And then all output would require code-page aware translation, but fix the above
first.
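
To give a rough idea of what such translation involves, here is a deliberately partial sketch that maps EBCDIC letters, digits, and space to ASCII. The function names are hypothetical, and the table covers only those positions (as found in code page 037); real code-page aware translation would need a full table for each supported code page.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical: map one EBCDIC byte to ASCII. Handles only space,
 * digits, and unaccented letters; returns -1 for anything else.
 */
int sketch_ebcdic_to_ascii(unsigned char e)
{
	if (e == 0x40) return 0x20;				/* space */
	if (e >= 0xF0 && e <= 0xF9) return 0x30 + (e - 0xF0);	/* 0-9 */
	if (e >= 0x81 && e <= 0x89) return 0x61 + (e - 0x81);	/* a-i */
	if (e >= 0x91 && e <= 0x99) return 0x6a + (e - 0x91);	/* j-r */
	if (e >= 0xA2 && e <= 0xA9) return 0x73 + (e - 0xA2);	/* s-z */
	if (e >= 0xC1 && e <= 0xC9) return 0x41 + (e - 0xC1);	/* A-I */
	if (e >= 0xD1 && e <= 0xD9) return 0x4a + (e - 0xD1);	/* J-R */
	if (e >= 0xE2 && e <= 0xE9) return 0x53 + (e - 0xE2);	/* S-Z */
	return -1;
}

/*
 * Hypothetical: translate len bytes from in to out. Returns 0 on
 * success, or -1 at the first byte that has no mapping above.
 */
int sketch_ebcdic_buf_to_ascii(const unsigned char *in, size_t len,
			       unsigned char *out)
{
	size_t i;

	assert(in != NULL && out != NULL);
	for (i = 0; i < len; i++) {
		int a = sketch_ebcdic_to_ascii(in[i]);
		if (a < 0)
			return -1;
		out[i] = (unsigned char)a;
	}
	return 0;
}
```

The three separate letter ranges per case show the gaps in the EBCDIC alphabet that break range-based character tests.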
>
> - many outputs from git should be ASCII in order to interoperate with
>   the outside world (object headers, network protocols, etc).
>
> So I'd be surprised if things worked well in an EBCDIC environment (but
> I have never worked with one, so maybe I do not understand all of the
> implications).
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
- -
- Jason Pyeron PD Inc. http://www.pdinc.us -
- Principal Consultant 10 West 24th Street #100 -
- +1 (443) 269-1555 x333 Baltimore, Maryland 21218 -
- -
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
This message is copyright PD Inc, subject to license 20080407P00.