git.vger.kernel.org archive mirror
* A shortcoming of the git repo format
@ 2005-04-27  5:43 H. Peter Anvin
From: H. Peter Anvin @ 2005-04-27  5:43 UTC
  To: Git Mailing List

Most of git's files are starting to converge toward an RFC822-like
header of (tag, data) pairs followed by a free-form section.  This is a
good thing.  However, there is one problem with this: without knowing
every possible tag, a program reading the git repository cannot safely
tell what is a link to another git object and what is not.  When I
wrote my repository conversion tools, I simply assumed any string of 20
hexadecimal digits was a pointer, but this is probably a bad idea in the
long run.
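
To illustrate why that heuristic is fragile, here is a minimal Python
sketch of it. It is not the actual conversion tool; the names HEX_ID,
guess_links and the sample header are hypothetical, and the length of
20 hex digits simply follows the figure used in this message.

```python
import re

# Assumption: object names appear as 20 lowercase hex digits, as stated
# in this message. Adjust the length if the real format differs.
HEX_ID = re.compile(r"\b[0-9a-f]{20}\b")

def guess_links(header_text):
    """Return every token that merely *looks* like an object name."""
    return HEX_ID.findall(header_text)

# Hypothetical header: the second line is free-form text that happens
# to contain 20 hex digits, so the heuristic reports it as a link too.
sample = (
    "parent 0123456789abcdef0123\n"
    "author fedcba98765432100123 <someone@example.org>\n"
)
print(guess_links(sample))
```

Both tokens are reported, although only the first is meant as a link.
Without knowing that "author" is a free-form tag, the reader has no way
to tell them apart.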

Additionally, there is the question of how to handle strings that may
contain \n or even \0 (which may be necessary for some applications).

One solution to all of this would be to define a quoting standard for
strings, and simply require that all free-format strings (like the
author fields), or at least those that match [0-9a-f]{20}, are always
quoted.

I propose the following:

- Any string containing control characters or \ must be quoted;
- \xXX (two hex digits) produces the character with that code, which is
how control characters are written; any other character following \ is
taken verbatim.

Thus,

link 0123456789abcdef0123

... is a link to an object, whereas ...

string \0123456789abcdef0123

... is a string.

string1  This string begins with a space
string2 This string has an embedded newline ("\x0a")

... are both valid strings; the first contains a leading space and the 
second an embedded newline.
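
A minimal sketch of how such quoting could work, in Python. This is one
reading of the rules above, not a finished implementation: the function
names quote, unquote and is_link are hypothetical, values are treated
as text rather than raw bytes, and a string that looks like an object
name is protected by a leading backslash, as in the example above.

```python
import re

# Assumption: a link is exactly 20 lowercase hex digits, as in this message.
HEX_ID = re.compile(r"[0-9a-f]{20}")

def quote(value):
    """Encode a free-form string so it can never be mistaken for a link."""
    out = []
    for ch in value:
        if ch == "\\":
            out.append("\\\\")               # a literal backslash is doubled
        elif ord(ch) < 0x20 or ord(ch) == 0x7F:
            out.append("\\x%02x" % ord(ch))  # control characters become \xXX
        else:
            out.append(ch)
    quoted = "".join(out)
    if HEX_ID.fullmatch(value):
        quoted = "\\" + quoted               # break the 20-hex-digit pattern
    return quoted

def unquote(field):
    """Decode a quoted string; raises ValueError on a malformed \\xXX."""
    out = []
    i = 0
    while i < len(field):
        ch = field[i]
        if ch != "\\":
            out.append(ch)
            i += 1
        elif field[i + 1:i + 2] == "x":
            out.append(chr(int(field[i + 2:i + 4], 16)))
            i += 4
        else:
            out.append(field[i + 1:i + 2])   # any other character is verbatim
            i += 2
    return "".join(out)

def is_link(field):
    """A field is a link only if it is an unquoted run of 20 hex digits."""
    return HEX_ID.fullmatch(field) is not None
```

With these definitions, quote("0123456789abcdef0123") yields
\0123456789abcdef0123, which is_link rejects and unquote turns back
into the original string. Note that under this reading a literal
backslash followed by "x" must be written as \\x, since \x always
starts a hex escape.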

I'll implement this and integrate it tomorrow.

	-hpa




Thread overview: 29+ messages
2005-04-27  5:43 A shortcoming of the git repo format H. Peter Anvin
2005-04-27 15:00 ` C. Scott Ananian
2005-04-27 15:22 ` Linus Torvalds
2005-04-27 18:03   ` H. Peter Anvin
2005-04-27 18:32     ` Dave Jones
2005-04-27 18:47       ` H. Peter Anvin
2005-04-27 22:51         ` Jon Seymour
2005-04-27 19:15       ` Linus Torvalds
2005-04-27 19:39       ` Petr Baudis
2005-04-27 19:11     ` Linus Torvalds
2005-04-27 19:47       ` The " Brian O'Mahoney
2005-04-27 20:40       ` A shortcoming of the " H. Peter Anvin
2005-04-27 20:49         ` Tom Lord
2005-04-27 20:59           ` H. Peter Anvin
2005-04-28  0:57           ` Linus Torvalds
2005-04-28  1:34             ` Paul Jackson
2005-04-28  2:14             ` Tom Lord
2005-04-28  3:37             ` Ryan Anderson
2005-04-28  8:31             ` Morgan Schweers
2005-04-28 15:08             ` Barry Silverman
2005-04-27 20:56         ` Linus Torvalds
2005-04-28  0:45           ` David A. Wheeler
2005-04-28  0:46             ` David Lang
2005-04-27 23:50         ` Daniel Barkalow
2005-04-27 23:56           ` H. Peter Anvin
2005-04-28  1:51             ` Daniel Barkalow
2005-04-28  1:56               ` H. Peter Anvin
2005-04-28 13:39     ` David Woodhouse
2005-04-27 20:58 ` Gerhard Schrenk
