* Support for EBCDIC
@ 2014-07-03  2:39 Scott McKellar
  2014-07-03 17:34 ` Jeff King
  0 siblings, 1 reply; 3+ messages in thread
From: Scott McKellar @ 2014-07-03  2:39 UTC (permalink / raw)
  To: git@vger.kernel.org   

Is Git supposed to be usable in an environment where the execution character set is EBCDIC?

I ask because, in browsing the source code (version 2.0.0), I stumbled across three functions that won't work as presumably intended in an EBCDIC environment (strihash(), memihash(), and git_user_agent_sanitized()).  I can report them as bugs, but if EBCDIC is considered out of scope, then they aren't bugs.

These three functions can be readily fixed to make them portable across character sets.  There may be other spots that are harder to fix.
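
As a rough illustration of the kind of fix meant here (a sketch, not the code in Git's tree): a case-insensitive FNV-1 string hash that folds case with toupper() from <ctype.h> instead of testing c >= 'a' && c <= 'z', a range that is contiguous in ASCII but not in EBCDIC. The name portable_strihash() and the macro names are assumptions made for this example.

```c
#include <ctype.h>

/* FNV-1 32-bit offset basis and prime (standard published values). */
#define FNV32_BASE  0x811c9dc5u
#define FNV32_PRIME 0x01000193u

/* Hypothetical name: case-insensitive hash of a NUL-terminated string. */
unsigned int portable_strihash(const char *str)
{
	unsigned int c, hash = FNV32_BASE;

	while ((c = (unsigned char)*str++) != 0) {
		/*
		 * toupper() follows the execution character set (and the
		 * current locale), so nothing here assumes that 'a'..'z'
		 * is one contiguous range of code points.
		 */
		c = (unsigned int)toupper((int)c);
		hash = (hash * FNV32_PRIME) ^ c;
	}
	return hash;
}
```

With this, portable_strihash("Hello") and portable_strihash("hELLO") give the same value on any character set.  One trade-off is that toupper() is locale-dependent, which a hand-rolled range test is not.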

I have done a lot of grepping and Googling, but I haven't found a clear, authoritative answer to this question.  From searching this mailing list, it appears that nobody is interested in supporting EBCDIC.  However, I found one wiki page describing how to run Git on an IBM i, which is an EBCDIC-based successor to the AS/400 series.  See:

    http://wsip-174-79-32-155.ph.ph.cox.net/wiki/index.php/PASE/Git

That installation was reportedly running version 1.7.9.4, which I believe predates the introduction of strihash() and memihash(); I don't know about git_user_agent_sanitized().

Mind you, I'm not advocating for EBCDIC.  I escaped from the EBCDIC world about fifteen years ago, and have no desire to return.  I just want to know if character set issues are worth reporting.  The same issues may arise for other, more obscure character sets.


Scott McKellar


* Re: Support for EBCDIC
  2014-07-03  2:39 Support for EBCDIC Scott McKellar
@ 2014-07-03 17:34 ` Jeff King
  2014-07-03 20:01   ` Jason Pyeron
  0 siblings, 1 reply; 3+ messages in thread
From: Jeff King @ 2014-07-03 17:34 UTC (permalink / raw)
  To: Scott McKellar; +Cc: git@vger.kernel.org   

On Wed, Jul 02, 2014 at 07:39:12PM -0700, Scott McKellar wrote:

> Is Git supposed to be usable in an environment where the execution character set is EBCDIC?

Not really.

In addition to the cases you found (and I would be surprised if there
are not more, such as our reimplementation of ctype.h), we assume:

  - we can intermingle ASCII from string literals with user data to form
    diffs, commit objects, network protocols, etc. This is actually a
    problem not just for EBCDIC, but for any encoding which is not an
    ASCII-superset (like UTF-16).

  - many outputs from git should be ASCII in order to interoperate with
    the outside world (object headers, network protocols, etc).
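
To make the first assumption concrete, here is a minimal sketch (assuming a C11 compiler for _Static_assert; the helper name is made up for illustration and nothing like it is claimed to exist in git). It checks that character and string literals compile to ASCII byte values, which is what the code silently relies on whenever a literal ends up in an object or on the wire:

```c
#include <string.h>

/* Compile-time check: fails to build if these literals are not ASCII. */
_Static_assert('A' == 0x41 && 'a' == 0x61 && '0' == 0x30 &&
	       ' ' == 0x20 && '\n' == 0x0a,
	       "execution character set is not ASCII-compatible");

/*
 * Hypothetical helper: returns 1 if the string literal "tree" is stored
 * as the ASCII bytes 0x74 0x72 0x65 0x65, and 0 otherwise.
 */
int literal_is_ascii_tree(void)
{
	static const unsigned char ascii_tree[] = { 0x74, 0x72, 0x65, 0x65 };

	return memcmp("tree", ascii_tree, sizeof(ascii_tree)) == 0;
}
```

On an EBCDIC compiler the static assertion would presumably stop the build, and without it the helper would return 0.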

So I'd be surprised if things worked well in an EBCDIC environment (but
I have never worked with one, so maybe I do not understand all of the
implications).

-Peff


* RE: Support for EBCDIC
  2014-07-03 17:34 ` Jeff King
@ 2014-07-03 20:01   ` Jason Pyeron
  0 siblings, 0 replies; 3+ messages in thread
From: Jason Pyeron @ 2014-07-03 20:01 UTC (permalink / raw)
  To: git; +Cc: 'Jeff King', 'Scott McKellar'

> -----Original Message-----
> From: Jeff King
> Sent: Thursday, July 03, 2014 13:34
> 
> On Wed, Jul 02, 2014 at 07:39:12PM -0700, Scott McKellar wrote:
> 
> > Is Git supposed to be usable in an environment where the execution character set is EBCDIC?
> 
> Not really.

If the core uses specific 8-bit values for its internals, then there is a hope and a
prayer.

E.g. "blob" would need to be char _BLOB[]={0x62,0x6c,0x6f,0x62}, so that the compiler
cannot change that "binary" data value, because the hash calculation would be wrong
if it were {0x82,0x93,0x96,0x82}.
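
A rough sketch of that idea, assuming the usual object header layout of "blob <size>" followed by a NUL byte (the helper name ascii_blob_header() and the BLOB_TAG array are hypothetical, not anything in git): every byte is given as a numeric value, so the compiler's character set never enters into it.

```c
#include <stddef.h>
#include <string.h>

/* "blob" as ASCII byte values, independent of the execution charset. */
static const unsigned char BLOB_TAG[4] = { 0x62, 0x6c, 0x6f, 0x62 };

/*
 * Hypothetical helper: write "blob <size>\0" into out using ASCII byte
 * values only.  Returns the header length including the NUL, or 0 if
 * out is too small.
 */
size_t ascii_blob_header(unsigned char *out, size_t outlen, unsigned long size)
{
	unsigned char digits[32];
	size_t n = 0, len, i;

	/* Decimal digits, least significant first; ASCII '0' is 0x30. */
	do {
		digits[n++] = (unsigned char)(0x30 + size % 10);
		size /= 10;
	} while (size != 0);

	if (outlen < sizeof(BLOB_TAG) + 1 + n + 1)
		return 0;

	memcpy(out, BLOB_TAG, sizeof(BLOB_TAG));
	len = sizeof(BLOB_TAG);
	out[len++] = 0x20;		/* ASCII space */
	for (i = n; i > 0; i--)
		out[len++] = digits[i - 1];
	out[len++] = 0x00;		/* terminating NUL */
	return len;
}
```

For a 12-byte blob this yields the 8 bytes 62 6c 6f 62 20 31 32 00 on any platform, so the hash input would match what an ASCII system produces.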

> 
> In addition to the cases you found (and I would be surprised if there
> are not more, such as our reimplementation of ctype.h), we assume:
> 
>   - we can intermingle ASCII from string literals with user data to form
>     diffs, commit objects, network protocols, etc. This is actually a
>     problem not just for EBCDIC, but for any encoding which is not an
>     ASCII-superset (like UTF-16).

And then all output would require code-page aware translation, but fix the above
first.

> 
>   - many outputs from git should be ASCII in order to interoperate with
>     the outside world (object headers, network protocols, etc).
> 
> So I'd be surprised if things worked well in an EBCDIC environment (but
> I have never worked with one, so maybe I do not understand all of the
> implications).

--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-                                                               -
- Jason Pyeron                      PD Inc. http://www.pdinc.us -
- Principal Consultant              10 West 24th Street #100    -
- +1 (443) 269-1555 x333            Baltimore, Maryland 21218   -
-                                                               -
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
This message is copyright PD Inc, subject to license 20080407P00.

