* Supressing sorting of trees
@ 2009-10-12 13:27 Sal Mangano
2009-10-12 14:20 ` Shawn O. Pearce
0 siblings, 1 reply; 9+ messages in thread
From: Sal Mangano @ 2009-10-12 13:27 UTC (permalink / raw)
To: git
I am using Git in a non-standard way and need to make a few tweaks in my
custom build. I have added a --nosort option to git mktree which will suppress
the qsort of the tree.
Will this break any other git functions? Are there any commands that assume
trees are always sorted?
* Re: Supressing sorting of trees
2009-10-12 13:27 Supressing sorting of trees Sal Mangano
@ 2009-10-12 14:20 ` Shawn O. Pearce
2009-10-12 15:43 ` Sal Mangano
0 siblings, 1 reply; 9+ messages in thread
From: Shawn O. Pearce @ 2009-10-12 14:20 UTC (permalink / raw)
To: Sal Mangano; +Cc: git
Sal Mangano <smangano@into-technology.com> wrote:
> I am using Git in a non-standard way and need to make a few tweaks in my
> custom build. I have added a --nosort option to git mktree which will suppress
> the qsort of the tree.
>
> Will this break any other git functions? Are there any commands that assume
> trees are always sorted?
_YES IT BREAKS GIT_.
You cannot do this.
A Git repository whose trees are not sorted according to the Git
specific sort ordering is severely broken and most tools will fail
horribly on it.
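A minimal sketch of the "Git specific sort ordering", assuming the usual description of base_name_compare: names are compared bytewise, and a subtree name is compared as if it ended in "/". The Entry type and the helper names git_sort_key and is_git_sorted are hypothetical, made up for illustration; this is a model of the rule, not Git's code.

```python
from typing import List, Tuple

# Hypothetical in-memory model of a tree entry: (name, is_subtree).
Entry = Tuple[bytes, bool]


def git_sort_key(entry: Entry) -> bytes:
    """Key that mimics Git's ordering: subtrees sort as if named 'name/'."""
    name, is_subtree = entry
    return name + b"/" if is_subtree else name


def is_git_sorted(entries: List[Entry]) -> bool:
    """True if the keys are strictly increasing, roughly what fsck checks."""
    keys = [git_sort_key(e) for e in entries]
    return all(a < b for a, b in zip(keys, keys[1:]))


# 'foo' is a subtree; the other two are blobs.
entries = [(b"foo", True), (b"foo.c", False), (b"foo-bar", False)]
canonical = sorted(entries, key=git_sort_key)
# A plain name sort would put 'foo' first; Git's ordering puts it last,
# because '/' (0x2f) sorts after both '-' (0x2d) and '.' (0x2e).
```

Note that the result differs from a plain sort on the names, which is why a generic qsort with strcmp is not enough either.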
Almost all code which reads trees assumes the names are sorted in a
specific order. These tools perform sorted merges against other tree
like structures. If the names are out of order the merge will fail.
`git fsck` will complain that the tree is not sorted properly.
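The "sorted merge" can be sketched as a two-pointer walk over two name lists. This is a simplified model (plain byte names, no modes or object ids), and merge_walk is a hypothetical helper, not a Git function. It shows how the same set of names gives wrong results once one side is out of order.

```python
from typing import Iterator, List, Tuple


def merge_walk(old: List[bytes], new: List[bytes]) -> Iterator[Tuple[str, bytes]]:
    """Report names added or deleted between two lists, assuming both are sorted."""
    i = j = 0
    while i < len(old) and j < len(new):
        if old[i] == new[j]:
            i += 1
            j += 1
        elif old[i] < new[j]:
            # old[i] cannot appear later in a sorted 'new', so it was deleted.
            yield ("deleted", old[i])
            i += 1
        else:
            yield ("added", new[j])
            j += 1
    for name in old[i:]:
        yield ("deleted", name)
    for name in new[j:]:
        yield ("added", name)


old = [b"a", b"b", b"c"]
sorted_result = list(merge_walk(old, [b"a", b"c", b"d"]))
# Same names as above, but the second list is no longer sorted.
unsorted_result = list(merge_walk(old, [b"c", b"a", b"d"]))
```

With the unsorted input, "a" is reported as both deleted and added even though it is present on both sides.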
Tools like `git log -- foo.c` will fail randomly because they break
out of the entry lookup as soon as they find a name that is after
foo.c, as they assume the tree is sorted.
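The early exit described here looks roughly like the following. It is a simplified sketch with plain byte names; find_entry is a hypothetical function, not the lookup code Git actually uses.

```python
from typing import List, Optional


def find_entry(names: List[bytes], wanted: bytes) -> Optional[int]:
    """Linear lookup that stops early, which is only valid on sorted input."""
    for index, name in enumerate(names):
        if name == wanted:
            return index
        if name > wanted:
            # In a sorted tree nothing after this point can match.
            return None
    return None


sorted_tree = [b"bar.c", b"foo.c", b"zed.c"]
unsorted_tree = [b"zed.c", b"bar.c", b"foo.c"]

found = find_entry(sorted_tree, b"foo.c")
missed = find_entry(unsorted_tree, b"foo.c")
```

On the unsorted list the lookup gives up at "zed.c" and never reaches "foo.c", so whether a path is found depends on what happens to precede it.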
I could go on. But there is no point.
Oh, and trust me when I say this, the tree sorting matters. Long ago
JGit had a bug where it didn't sort trees correctly all of the time
and we had a devil of a time tracking down that corruption.
--
Shawn.
* Re: Supressing sorting of trees
2009-10-12 14:20 ` Shawn O. Pearce
@ 2009-10-12 15:43 ` Sal Mangano
2009-10-12 16:05 ` Johannes Schindelin
0 siblings, 1 reply; 9+ messages in thread
From: Sal Mangano @ 2009-10-12 15:43 UTC (permalink / raw)
To: git
Shawn O. Pearce <spearce <at> spearce.org> writes:
>
> Sal Mangano <smangano <at> into-technology.com> wrote:
> > I am using Git in a non-standard way and need to make a few tweaks in my
> > custom build. I have added a --nosort option to git mktree which will suppress
> > the qsort of the tree.
> >
> > Will this break any other git functions? Are there any commands that assume
> > trees are always sorted?
>
> _YES IT BREAKS GIT_.
>
> You cannot do this.
>
> A Git repository whose trees are not sorted according to the Git
> specific sort ordering is severely broken and most tools will fail
> horribly on it.
>
> Almost all code which reads trees assumes the names are sorted in a
> specific order. These tools perform sorted merges against other tree
> like structures. If the names are out of order the merge will fail.
> `git fsck` will complain that the tree is not sorted properly.
> Tools like `git log -- foo.c` will fail randomly because they break
> out of the entry lookup as soon as they find a name that is after
> foo.c, as they assume the tree is sorted.
>
> I could go on. But there is no point.
>
> Oh, and trust me when I say this, the tree sorting matters. Long ago
> JGit had a bug where it didn't sort trees correctly all of the time
> and we had a devil of a time tracking down that corruption.
>
Thanks Shawn. I get the picture.
Now, let's assume I am stubborn, crazy or both :-)
I can modify fsck to ignore unsorted and at the moment I don't care about
merging trees. If I hunt down all usage of base_name_compare will that identify
all code with the sort assumption or are there other places as well? I can go
through the entire source to figure this out myself but I need to get something
hacked up very quickly and would appreciate help even if you think I am nuts!
* Re: Supressing sorting of trees
2009-10-12 15:43 ` Sal Mangano
@ 2009-10-12 16:05 ` Johannes Schindelin
2009-10-12 16:51 ` Sal Mangano
0 siblings, 1 reply; 9+ messages in thread
From: Johannes Schindelin @ 2009-10-12 16:05 UTC (permalink / raw)
To: Sal Mangano; +Cc: git
Hi,
On Mon, 12 Oct 2009, Sal Mangano wrote:
> Shawn O. Pearce <spearce <at> spearce.org> writes:
>
> >
> > Sal Mangano <smangano <at> into-technology.com> wrote:
> > > I am using Git in a non-standard way and need to make a few tweaks
> > > in my custom build. I have added a --nosort option to git mktree
> > > which will suppress the qsort of the tree.
> > >
> > > Will this break any other git functions? Are there any commands that
> > > assume trees are always sorted?
> >
> > _YES IT BREAKS GIT_.
> >
> > You cannot do this.
> >
> > A Git repository whose trees are not sorted according to the Git
> > specific sort ordering is severely broken and most tools will fail
> > horribly on it.
> >
> > Almost all code which reads trees assumes the names are sorted in a
> > specific order. These tools perform sorted merges against other tree
> > like structures. If the names are out of order the merge will fail.
> > `git fsck` will complain that the tree is not sorted properly. Tools
> > like `git log -- foo.c` will fail randomly because they break out of
> > the entry lookup as soon as they find a name that is after foo.c, as
> > they assume the tree is sorted.
> >
> > I could go on. But there is no point.
> >
> > Oh, and trust me when I say this, the tree sorting matters. Long ago
> > JGit had a bug where it didn't sort trees correctly all of the time
> > and we had a devil of a time tracking down that corruption.
> >
>
> Thanks Shawn. I get the picture.
>
> Now, let's assume I am stubborn, crazy or both :-)
>
> I can modify fsck to ignore unsorted and at the moment I don't care
> about merging trees. If I hunt down all usage of base_name_compare will
> that identify all code with the sort assumption or are there other places
> as well? I can go through the entire source to figure this out myself
> but I need to get something hacked up very quickly and would appreciate
> help even if you think I am nuts!
Look, one of the most trusted Git contributors just told you that you are
asking for trouble.
It has nothing to do with being stubborn if you insist on doing it now.
But I smell an XY problem. Why don't you just reveal _what_ you want to
do (as opposed to _how_ you think you should do it)?
Ciao,
Dscho
* Re: Supressing sorting of trees
2009-10-12 16:05 ` Johannes Schindelin
@ 2009-10-12 16:51 ` Sal Mangano
2009-10-12 19:36 ` Martin Langhoff
2009-10-13 20:49 ` Ealdwulf Wuffinga
0 siblings, 2 replies; 9+ messages in thread
From: Sal Mangano @ 2009-10-12 16:51 UTC (permalink / raw)
To: git
Johannes Schindelin <Johannes.Schindelin <at> gmx.de> writes:
>
> Hi,
>
> On Mon, 12 Oct 2009, Sal Mangano wrote:
>
> > Shawn O. Pearce <spearce <at> spearce.org> writes:
> >
> > >
> > > Sal Mangano <smangano <at> into-technology.com> wrote:
> > > > I am using Git in a non-standard way and need to make a few tweaks
> > > > in my custom build. I have added a --nosort option to git mktree
> > > > which will suppress the qsort of the tree.
> > > >
> > > > Will this break any other git functions? Are there any commands that
> > > > assume trees are always sorted?
> > >
> > > _YES IT BREAKS GIT_.
> > >
> > > You cannot do this.
> > >
> > > A Git repository whose trees are not sorted according to the Git
> > > specific sort ordering is severely broken and most tools will fail
> > > horribly on it.
> > >
> > > Almost all code which reads trees assumes the names are sorted in a
> > > specific order. These tools perform sorted merges against other tree
> > > like structures. If the names are out of order the merge will fail.
> > > `git fsck` will complain that the tree is not sorted properly. Tools
> > > like `git log -- foo.c` will fail randomly because they break out of
> > > the entry lookup as soon as they find a name that is after foo.c, as
> > > they assume the tree is sorted.
> > >
> > > I could go on. But there is no point.
> > >
> > > Oh, and trust me when I say this, the tree sorting matters. Long ago
> > > JGit had a bug where it didn't sort trees correctly all of the time
> > > and we had a devil of a time tracking down that corruption.
> > >
> >
> > Thanks Shawn. I get the picture.
> >
> > Now, let's assume I am stubborn, crazy or both
> >
> > I can modify fsck to ignore unsorted and at the moment I don't care
> > about merging trees. If I hunt down all usage of base_name_compare will
> > that identify all code with the sort assumption or are there other places
> > as well? I can go through the entire source to figure this out myself
> > but I need to get something hacked up very quickly and would appreciate
> > help even if you think I am nuts!
>
> Look, one of the most trusted Git contributors just told you that you are
> asking for trouble.
>
> It has nothing to do with being stubborn if you insist on doing it now.
>
> But I smell an XY problem. Why don't you just reveal _what_ you want to
> do (as opposed to _how_ you think you should do it)?
>
> Ciao,
> Dscho
>
>
My apologies for being cryptic.
I am working on a project where I need to create a repository consisting of
hierarchical "blobs" of content (sound familiar?). In this repository the
order of the blobs as specified by the end user is definitely important.
However, I have a bunch of other reqs that fit Git perfectly such as the
ability to quickly tell if two trees are the same using their SHA1 and the
ability to version control the repository. My repository has no relationship
to files stored on a file system unlike a typical use of Git. I also don't
care about whether my repository remains compatible with standard Git because
no one will access this repository using standard Git.
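The property relied on here, comparing two trees by their SHA1, can be sketched as follows. A tree's id is the SHA-1 of a "tree <size>" header, a NUL byte, and the serialized entries, so the id depends on the order in which the entries are written. The helper names blob_id and tree_id are hypothetical, and the sketch assumes the classic SHA-1 object format.

```python
import hashlib
from typing import List, Tuple

# (mode, name, raw 20-byte object id), serialized in the order given.
TreeEntry = Tuple[bytes, bytes, bytes]


def blob_id(data: bytes) -> bytes:
    """Raw SHA-1 id of a blob: hash of 'blob <size>', NUL, then the content."""
    header = b"blob " + str(len(data)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + data).digest()


def tree_id(entries: List[TreeEntry]) -> str:
    """Hex SHA-1 id of a tree whose entries are written exactly as listed."""
    body = b"".join(mode + b" " + name + b"\x00" + oid for mode, name, oid in entries)
    header = b"tree " + str(len(body)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


a = (b"100644", b"a.txt", blob_id(b"first\n"))
b = (b"100644", b"b.txt", blob_id(b"second\n"))

same_order_matches = tree_id([a, b]) == tree_id([a, b])
reordered_matches = tree_id([a, b]) == tree_id([b, a])
```

The same entries in a different order hash to a different id, which is why stock Git insists on one canonical order: without it, equal content would no longer imply equal SHA1.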
Now I can proceed in a few ways:
1) I can write my repository from scratch.
2) I can use Git unchanged but preserve order by storing some information in
each sub tree (e.g. an extra blob) which retains the real order. I can also
store this information once for the whole "chunks" of the repository.
3) I can change Git to suit my needs understanding that it is not Git
anymore.
For me, (1) makes no sense at this time. I started with the hope that (2)
would work but realized it is very awkward and will cause performance problems
because it means most updates where ordering matters will have to update the
Git trees and my private ordering blob(s). So, after a quick look at the
source code it seemed like hacking Git into what I wanted was easier than 1
or 2.
I realized tree merge would probably break and wanted to know what
else. It is good to know fsck breaks. What else will break that I have to
deal with?
* Re: Supressing sorting of trees
2009-10-12 16:51 ` Sal Mangano
@ 2009-10-12 19:36 ` Martin Langhoff
2009-10-12 20:02 ` Salvatore Mangano
2009-10-13 20:49 ` Ealdwulf Wuffinga
1 sibling, 1 reply; 9+ messages in thread
From: Martin Langhoff @ 2009-10-12 19:36 UTC (permalink / raw)
To: Sal Mangano; +Cc: git
On Mon, Oct 12, 2009 at 6:51 PM, Sal Mangano
<smangano@into-technology.com> wrote:
> 2) I can use Git unchanged but preserve order by storing some information in
> each sub tree (e.g. an extra blob) which retains the real order. I can also
This #2 is your best bet by far. An extra blob in each subdir is just
one option, you can handle this "extra metadata" in a number of ways
-- maybe external to git, on a separate history will work best.
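A rough sketch of the "extra blob" variant, modelling one subtree as a dict from name to content. The manifest name ".order" and both helper functions are assumptions made up for this example; nothing here is a Git convention.

```python
from typing import Dict, List

ORDER_FILE = ".order"  # hypothetical manifest name, one entry name per line


def write_order(tree: Dict[str, bytes], ordered_names: List[str]) -> None:
    """Record the user's order in an extra blob stored beside the entries."""
    tree[ORDER_FILE] = "\n".join(ordered_names).encode("utf-8")


def read_in_user_order(tree: Dict[str, bytes]) -> List[str]:
    """Return entry names in manifest order; unlisted names go last, sorted."""
    manifest = tree.get(ORDER_FILE, b"").decode("utf-8").splitlines()
    present = set(tree) - {ORDER_FILE}
    listed = [name for name in manifest if name in present]
    unlisted = sorted(present - set(listed))
    return listed + unlisted


subtree = {"intro": b"...", "chorus": b"...", "verse": b"..."}
write_order(subtree, ["verse", "intro", "chorus"])
user_order = read_in_user_order(subtree)

# An entry added without touching the manifest still shows up, at the end.
subtree["bridge"] = b"..."
order_with_new_entry = read_in_user_order(subtree)
```

Git itself still sees an ordinary, correctly sorted tree; only the application reads the manifest. The cost is the one raised earlier in the thread: a reorder has to rewrite the manifest blob as well.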
The downsides of messing with internal tree handling of git are so
staggering that you'd do better to throw git away.
(this is from experience of abusing git to various purposes that have
little to do with version control :-) )
In other words: Shawn and Dscho are right, so right that it hurts.
hth,
m
--
martin.langhoff@gmail.com
martin@laptop.org -- School Server Architect
- ask interesting questions
- don't get distracted with shiny stuff - working code first
- http://wiki.laptop.org/go/User:Martinlanghoff
* Re: Supressing sorting of trees
2009-10-12 19:36 ` Martin Langhoff
@ 2009-10-12 20:02 ` Salvatore Mangano
2009-10-12 20:24 ` Martin Langhoff
0 siblings, 1 reply; 9+ messages in thread
From: Salvatore Mangano @ 2009-10-12 20:02 UTC (permalink / raw)
To: Martin Langhoff; +Cc: git
On Oct 12, 2009, at 3:36 PM, Martin Langhoff wrote:
> On Mon, Oct 12, 2009 at 6:51 PM, Sal Mangano
> <smangano@into-technology.com> wrote:
>> 2) I can use Git unchanged but preserve order by storing some
>> information in
>> each sub tree (e.g. an extra blob) which retains the real order. I
>> can also
>
> This #2 is your best bet by far. An extra blob in each subdir is just
> one option, you can handle this "extra metadata" in a number of ways
> -- maybe external to git, on a separate history will work best.
>
> The downsides of messing with internal tree handling of git are so
> staggering that you'd do better to throw git away.
>
> (this is from experience of abusing git to various purposes that have
> little to do with version control :-) )
>
> In other words: Shawn and Dscho are right, so right that it hurts.
>
Thanks Martin. I suspect you, Shawn and Dscho are correct. But can anyone
point to specific code that would let me see first hand that this is
hopeless? So far, based on the code I have looked at, I see it as problematic
but not hopeless. Here I define "problematic" as having to change a few files
and/or avoid using some features, and "hopeless" as having to change almost
every single plumbing command.
* Re: Supressing sorting of trees
2009-10-12 20:02 ` Salvatore Mangano
@ 2009-10-12 20:24 ` Martin Langhoff
0 siblings, 0 replies; 9+ messages in thread
From: Martin Langhoff @ 2009-10-12 20:24 UTC (permalink / raw)
To: Salvatore Mangano; +Cc: git
On Mon, Oct 12, 2009 at 10:02 PM, Salvatore Mangano
<smangano@into-technology.com> wrote:
> point to specific code
Shawn pointed out some very core code. And it is just a core concept.
Just read up on the core organizing concept that is the "tree". Git
relies on the layout of the tree being strictly deterministic.
It is a prevalent assumption in the whole codebase.
Yes you can change it, but assume you will have to audit/rewrite 80%
of the core code.
Want "proof"? Go change it, then try "make test", or reimport a large
repository and try to use the git commands over it. We'll relax and watch
the fireworks :-)
m
--
martin.langhoff@gmail.com
martin@laptop.org -- School Server Architect
- ask interesting questions
- don't get distracted with shiny stuff - working code first
- http://wiki.laptop.org/go/User:Martinlanghoff
* Re: Supressing sorting of trees
2009-10-12 16:51 ` Sal Mangano
2009-10-12 19:36 ` Martin Langhoff
@ 2009-10-13 20:49 ` Ealdwulf Wuffinga
1 sibling, 0 replies; 9+ messages in thread
From: Ealdwulf Wuffinga @ 2009-10-13 20:49 UTC (permalink / raw)
To: Sal Mangano; +Cc: git
On Mon, Oct 12, 2009 at 5:51 PM, Sal Mangano
<smangano@into-technology.com> wrote:
> 1) I can write my repository from scratch.
> 2) I can use Git unchanged but preserve order by storing some information in
> each sub tree (e.g. an extra blob) which retains the real order. I can also
> store this information once for the whole "chunks" of the repository.
> 3) I can change Git to suit my needs understanding that it is not Git
> anymore.
>
> For me, (1) makes no sense at this time. I started with the hope that (2)
> would work but realized it is very awkward and will cause performance problems
> because it means most updates where ordering matters will have to update the
> Git trees and my private ordering blob(s). So, after a quick look at the
> source code it seemed like hacking Git into what I wanted was easier than 1
> or 2.
You could add a prefix to the names so you get the order you want. Eg:
a-foo
b-bar
c-baz
If you need to move foo to between bar and baz, you just rename it to
ba-foo, etc.
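Picking the new prefix can be automated. The sketch below assumes prefixes use only the letters a to z and never end in "a", so that there is always room to insert before a key. The function key_between is hypothetical, and it picks a midpoint ("bn") rather than the "ba" used in the example above.

```python
from typing import Optional

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def key_between(lo: str, hi: Optional[str]) -> str:
    """Return a prefix k with lo < k < hi under plain string comparison.

    lo == "" means no lower bound and hi is None means no upper bound.
    Assumes lo < hi and that neither key ends in 'a'.
    """
    if hi is not None:
        # Copy the shared prefix, treating missing characters of lo as 'a'.
        n = 0
        while n < len(hi) and (lo[n] if n < len(lo) else "a") == hi[n]:
            n += 1
        if n > 0:
            return hi[:n] + key_between(lo[n:], hi[n:])
    low_digit = ALPHABET.index(lo[0]) if lo else 0
    high_digit = ALPHABET.index(hi[0]) if hi is not None else len(ALPHABET)
    if high_digit - low_digit > 1:
        # There is a letter strictly between the two first letters.
        return ALPHABET[(low_digit + high_digit) // 2]
    if hi is not None and len(hi) > 1:
        # hi's first letter alone already sorts below hi.
        return hi[:1]
    # First letters are adjacent: keep lo's letter and extend past lo.
    return ALPHABET[low_digit] + key_between(lo[1:], None)


# Move foo (prefix 'a') to between bar (prefix 'b') and baz (prefix 'c').
new_prefix = key_between("b", "c")
names = sorted(["b-bar", "c-baz", new_prefix + "-foo"])
```

The "-" separator sorts below every letter, so a short prefix such as "b-" always lands before longer prefixes that extend it, such as "bn-".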
Ealdwulf