git.vger.kernel.org archive mirror
* Is my repository broken?
@ 2008-04-08 22:02 Julian Phillips
  2008-04-08 22:13 ` Junio C Hamano
  2008-04-08 22:55 ` Shawn O. Pearce
  0 siblings, 2 replies; 6+ messages in thread
From: Julian Phillips @ 2008-04-08 22:02 UTC
  To: git

Having just converted a subversion repository to git using a custom 
fast-import based script, I realised that I had forgotten to add some 
users to the mapping table - so I thought that filter-branch would save me 
from having to redo the 72hr import.

At which point I discovered that I had a number of commits that git didn't 
like.  In particular I had a number of commits with an empty ident, and a 
number with more than 16 parents (actually mostly these are the same 
commits - but I don't think that's relevant).  When filter-branch tries to 
process these commits it can't commit them - and so I end up losing a 
number of refs.
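
A minimal sketch of how the affected commits could be located before running filter-branch. It assumes the output of `git rev-list --parents --all` and of `git log --all --format=%H%x09%an%x09%cn` has already been captured as text; the helper names are hypothetical, and treating a blank author or committer name as the "empty ident" is an assumption.

```python
MAX_PARENTS = 16  # the limit discussed in this thread


def commits_over_parent_limit(rev_list_output, limit=MAX_PARENTS):
    """Parse `git rev-list --parents` output.

    Each line is '<commit> <parent1> <parent2> ...'. Returns a dict
    mapping commit id to parent count for commits above `limit`.
    """
    over = {}
    for line in rev_list_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        commit, parents = fields[0], fields[1:]
        if len(parents) > limit:
            over[commit] = len(parents)
    return over


def commits_with_empty_ident(log_output):
    """Parse tab-separated '<hash>\\t<author name>\\t<committer name>' lines.

    Returns the hashes whose author or committer name is blank.
    """
    empty = []
    for line in log_output.splitlines():
        if not line.strip():
            continue
        parts = line.split("\t")
        commit = parts[0]
        author = parts[1].strip() if len(parts) > 1 else ""
        committer = parts[2].strip() if len(parts) > 2 else ""
        if not author or not committer:
            empty.append(commit)
    return empty
```

Both functions only read text, so they can be tried on a saved listing without touching the repository.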

I can fix the empty ident using the env-filter in filter-branch, and I can 
work around the 16 parent limit by building a custom git with MAX_PARENT = 
32.  However - will having commits with more than 16 parents break things 
for un-modified git?  The commits in question were originally created in 
CVS, so the parenting is kinda 'real' for some value of real - but we are 
talking revision 4000/40000, so I don't mind arbitrarily limiting the 
commits to 16 parents, but I would prefer to avoid redoing the 72hr import 
if possible.  (Perhaps I can also 'fix' the parenting in filter-branch?)
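
One way the parenting might be adjusted during the rewrite is filter-branch's `--parent-filter`, which receives the parent string on stdin in commit-tree form (`-p <id> -p <id> ...`) and prints the new one. The sketch below is only an illustration under that assumption: the function names and the policy of keeping the first 16 distinct parents are hypothetical, and which parents to drop is an arbitrary choice.

```python
import sys

MAX_PARENTS = 16  # the limit discussed in this thread


def limit_parents(parent_string, limit=MAX_PARENTS):
    """Rewrite a '-p <id> -p <id> ...' parent string.

    Repeated parents are dropped and at most `limit` parents are kept,
    in their original order.
    """
    tokens = parent_string.split()
    # Every token that directly follows a '-p' is a parent id.
    parents = [tok for prev, tok in zip(tokens, tokens[1:]) if prev == "-p"]
    kept = []
    for parent in parents:
        if parent not in kept:
            kept.append(parent)
    return " ".join(f"-p {p}" for p in kept[:limit])


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read the parent string from stdin and write the rewritten one."""
    stdout.write(limit_parents(stdin.read()) + "\n")
```

Saved as a script that calls `main()` at the bottom (say, a hypothetical `limit_parents.py`), it could be passed as `--parent-filter 'python3 limit_parents.py'`.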

Also, shouldn't fast-import be imposing the same restrictions on what you 
are allowed to commit that the main git tools do?  If not, are such 
restrictions documented so that I can apply them in my conversion script?

(28G Subversion repository to 7.1G git repository (before repacking) ... 
not bad :D)

-- 
Julian

  ---
Agnes' Law:
 	Almost everything in life is easier to get into than out of.


* Re: Is my repository broken?
  2008-04-08 22:02 Is my repository broken? Julian Phillips
@ 2008-04-08 22:13 ` Junio C Hamano
  2008-04-08 22:55 ` Shawn O. Pearce
  1 sibling, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2008-04-08 22:13 UTC
  To: Julian Phillips; +Cc: git

Julian Phillips <julian@quantumfyre.co.uk> writes:

> ...  However - will having commits with more than 16
> parents break things for un-modified git?

Offhand, 82d853d (builtin-blame.c: allow more than 16 parents, 2008-04-03)
comes to mind.  I would not be surprised, though, if Porcelain scripts on
the periphery had trouble with commits with duplicated parents and other
such esoterica.


* Re: Is my repository broken?
  2008-04-08 22:02 Is my repository broken? Julian Phillips
  2008-04-08 22:13 ` Junio C Hamano
@ 2008-04-08 22:55 ` Shawn O. Pearce
  2008-04-08 23:21   ` Julian Phillips
  1 sibling, 1 reply; 6+ messages in thread
From: Shawn O. Pearce @ 2008-04-08 22:55 UTC
  To: Julian Phillips; +Cc: git

Julian Phillips <julian@quantumfyre.co.uk> wrote:
> [...] In particular I had a number of commits with an empty ident [...]
...
> Also, shouldn't fast-import be imposing the same restrictions on what you 
> are allowed to commit that the main git tools do?  If not, are such 
> restrictions documented so that I can apply them in my conversion script?

Hmm, no.  fast-import allows what the generalized data model permits
in the object store; it's really plumbing.  If you are feeding it
an input stream that creates data that isn't compliant with what
the higher level VCS porcelain wants, well, all I can say is "don't
do that".

The fast-import manual specifically warns in the "merge" command
documentation that you may not want to use more than 15 merge
commands, as it can create a commit that other tools based around
git won't like.  But we still let you do it.

We also still let you create a commit with duplicate parents.
Some tools (gitk) have had issues with that in the past, but many of
them have been fixed after a fast-import result was used with them.
From a VCS point of view it's silly to list the same ancestor twice.
But from an object model point of view, it may make sense if you
were building something else on top of the core plumbing.

The same holds true for the empty ident.
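
Since fast-import itself does not reject these cases, a conversion script could check each commit before writing it to the stream. The following is a rough sketch, not part of fast-import: `PendingCommit` and `lint_commit` are hypothetical names, the figure of 15 `merge` commands is the one quoted from the manual above, and a blank author name stands in for the empty ident.

```python
from dataclasses import dataclass, field
from typing import List

MAX_MERGE_COMMANDS = 15  # figure quoted from the fast-import manual above


@dataclass
class PendingCommit:
    """Hypothetical record a conversion script might hold per commit."""
    author_name: str
    author_email: str
    from_commit: str = ""                            # argument of 'from'
    merges: List[str] = field(default_factory=list)  # arguments of 'merge'


def lint_commit(commit):
    """Return a list of problems the porcelain is likely to object to."""
    problems = []
    if not commit.author_name.strip():
        problems.append("empty author name")
    if len(commit.merges) > MAX_MERGE_COMMANDS:
        problems.append(
            f"{len(commit.merges)} merge commands "
            f"(more than {MAX_MERGE_COMMANDS})"
        )
    parents = ([commit.from_commit] if commit.from_commit else []) + commit.merges
    duplicates = sorted({p for p in parents if parents.count(p) > 1})
    if duplicates:
        problems.append("duplicate parents: " + ", ".join(duplicates))
    return problems
```

The caller decides whether a non-empty result is a warning or a hard error, which keeps the policy out of the check itself.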

-- 
Shawn.


* Re: Is my repository broken?
  2008-04-08 22:55 ` Shawn O. Pearce
@ 2008-04-08 23:21   ` Julian Phillips
  2008-04-09  6:50     ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Julian Phillips @ 2008-04-08 23:21 UTC
  To: Shawn O. Pearce; +Cc: git

On Tue, 8 Apr 2008, Shawn O. Pearce wrote:

> Julian Phillips <julian@quantumfyre.co.uk> wrote:
>> [...] In particular I had a number of commits with an empty ident [...]
> ...
>> Also, shouldn't fast-import be imposing the same restrictions on what you
>> are allowed to commit that the main git tools do?  If not, are such
>> restrictions documented so that I can apply them in my conversion script?
>
> Hmm, no.  fast-import allows what the generalized data model permits
> in the object store; it's really plumbing.  If you are feeding it
> an input stream that creates data that isn't compliant with what
> the higher level VCS porcelain wants, well, all I can say is "don't
> do that".

Well, the 16 parent limit is enforced in builtin-commit-tree.c, and the 
commit-tree command is listed as plumbingmanipulators in command-list.txt 
- and the #define line actually blames all the way back to Linus' original 
commit.  So that's not really a porcelain thing ;), but ... *shrugs*

> The fast-import manual specifically warns in the "merge" command
> documentation that you may not want to use more than 15 merge
> commands, as it can create a commit that other tools based around
> git won't like.  But we still let you do it.

Ok, so the answer is "read the manpage" ... that's what I get for spending 
so long playing with fast-import that I don't read the manpage anymore 
(I found the syntax description in fast-import.c a more useful 
reference) ... ho hum.

> We also still let you create a commit with duplicate parents.

Well - that certainly shouldn't be happening - I remembered to check for 
that one.

> Some tools (gitk) have had issues with that in the past, but many of
> them have been fixed after a fast-import result was used with them.
> From a VCS point of view it's silly to list the same ancestor twice.
> But from an object model point of view, it may make sense if you
> were building something else on top of the core plumbing.

Would it make sense perhaps for fast-import to warn about (or even error 
out on) such things unless you tell it not to?  Most people would probably 
want to know if they were creating a repository that wasn't going to play 
nice with git's main tool suite?  (I didn't realise that I _was_ creating 
merges with more than 16 parents until filter-branch told me).

> The same holds true for the empty ident.

Ok - but I can't even find a note in the manpage for this one ...

-- 
Julian

  ---
I thought there was chocolate inside ... Well, why was it wrapped in foil?

 		-- Homer Simpson
 		   Mr. Plow


* Re: Is my repository broken?
  2008-04-08 23:21   ` Julian Phillips
@ 2008-04-09  6:50     ` Junio C Hamano
  2008-04-09 10:01       ` Julian Phillips
  0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2008-04-09  6:50 UTC
  To: Julian Phillips; +Cc: Shawn O. Pearce, git

Julian Phillips <julian@quantumfyre.co.uk> writes:

> On Tue, 8 Apr 2008, Shawn O. Pearce wrote:
>
>> The same holds true for the empty ident.
>
> Ok - but I can't even find a note in the manpage for this one ...

That's not a fair complaint.

It is often very hard to document that "we do not do X", because the line
to stop at becomes fuzzier as you try to do a more thorough job.  We do not
warn on empty ident, we do not warn on typos in commit log messages, we do
not warn on empty blob, we do not warn on ...  You get the idea.


* Re: Is my repository broken?
  2008-04-09  6:50     ` Junio C Hamano
@ 2008-04-09 10:01       ` Julian Phillips
  0 siblings, 0 replies; 6+ messages in thread
From: Julian Phillips @ 2008-04-09 10:01 UTC
  To: Junio C Hamano; +Cc: Shawn O. Pearce, git

On Tue, 8 Apr 2008, Junio C Hamano wrote:

> Julian Phillips <julian@quantumfyre.co.uk> writes:
>
>> On Tue, 8 Apr 2008, Shawn O. Pearce wrote:
>>
>>> The same holds true for the empty ident.
>>
>> Ok - but I can't even find a note in the manpage for this one ...
>
> That's not a fair complaint.

I didn't mean it as a complaint, but rather was hoping for a response of 
the form "no, it's not there" or "it's in the ... section" - sorry for not 
being clear.  It caught me out, if that was my own fault then fair enough 
- but if this was because the documentation doesn't make it clear then I 
can submit a documentation patch to try and help others avoid the same 
problem.

I think that the fast-import tool is extremely useful, and generally very 
well documented.  That doesn't mean that the documentation can't be 
improved though.

> It is often very hard to document that "we do not do X", because the line
> to stop at becomes fuzzier as you try to do a more thorough job.  We do not
> warn on empty ident, we do not warn on typos in commit log messages, we do
> not warn on empty blob, we do not warn on ...  You get the idea.

Git doesn't die when trying to commit typos though ...  I think "creating 
a commit that you could not create using git-commit" is a pretty hard 
line.

I don't think that it is entirely unreasonable to expect that when an 
existing repository is run through a tool like filter-branch that all your 
existing commits are preserved - and that you don't lose large chunks 
because it turns out that they are actually invalid by the rules of 
git-commit.

I accept that you may want fast-import to create things that are 
technically illegal, but at the very least it ought to be possible to find 
out what restrictions are not being enforced.  Otherwise it might be that 
you manage to destroy a previously functioning repository by accident long 
after you thought you had successfully converted your repository.  After 
all, I would have thought that the majority of people using fast-import 
(either directly, or indirectly by using a fast-import based importer) 
would actually intend to use the repository created with the normal git 
tools from then on.

-- 
Julian

  ---
I have often regretted my speech, never my silence.
 		-- Publilius Syrus

