git.vger.kernel.org archive mirror
* Re: Yet another base64 patch
  2005-04-14  4:19 Yet another base64 patch H. Peter Anvin
@ 2005-04-14  2:24 ` Christopher Li
  2005-04-14  5:36   ` H. Peter Anvin
  2005-04-14  4:25 ` H. Peter Anvin
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 34+ messages in thread
From: Christopher Li @ 2005-04-14  2:24 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: git

On Wed, Apr 13, 2005 at 09:19:48PM -0700, H. Peter Anvin wrote:
> Checking out the total kernel tree (time checkout-cache -a into an empty 
> directory):
> 
> 	Cache cold	Cache hot
> stock	3:46.95		19.95
> base64	5:56.20		23.74
> flat	2:44.13		15.68


> It seems that the flat format, at least on ext3 with dircache, is 
> actually a major performance win, and that the second level loses quite 
> a bit.

That is not surprising, due to the directory index in ext3. Htree is pretty
good at random access, and the hashed file names distribute evenly, which is
the best case for htree.

Chris


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Yet another base64 patch
  2005-04-14  5:36   ` H. Peter Anvin
@ 2005-04-14  2:42     ` Christopher Li
  2005-04-14  6:27       ` H. Peter Anvin
  0 siblings, 1 reply; 34+ messages in thread
From: Christopher Li @ 2005-04-14  2:42 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: git

On Wed, Apr 13, 2005 at 10:36:52PM -0700, H. Peter Anvin wrote:
> Christopher Li wrote:
> >On Wed, Apr 13, 2005 at 09:19:48PM -0700, H. Peter Anvin wrote:
> >
> >That is not surprising, due to the directory index in ext3. Htree is pretty
> >good at random access, and the hashed file names distribute evenly, which is
> >the best case for htree.
> >
> 
> Right, so by not trying to do the filesystem's job for it we actually 
> come out ahead.
>

But if you write a large number of random files, once htree has a
three-level index, htree suffers because it dirties random blocks very
quickly; most dirtied blocks contain only one or two new entries.
Ext3 will choke on that due to its limited journal size.

With a non-indexed directory, new entries are packed densely into the
blocks, so far fewer blocks get dirtied; of course, lookup will suffer.

It depends on whether you want to check out fast or write a big tree
fast; you can't win it all.

Chris

 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Yet another base64 patch
@ 2005-04-14  4:19 H. Peter Anvin
  2005-04-14  2:24 ` Christopher Li
                   ` (3 more replies)
  0 siblings, 4 replies; 34+ messages in thread
From: H. Peter Anvin @ 2005-04-14  4:19 UTC (permalink / raw)
  To: git

I am assuming this will be the last one one way or another...

I decided that filenames/tags beginning with - were a really bad thing,
so I decided that, ugly though it might be, the best way was to do a hybrid
between regular base64 (+ /) and filesystem-safe base64 (- _) and use
+ _ as the non-alphanumeric characters needed.  I have updated the base64
patches as well as gitcvt, and also put out a flat version of gitcvt.
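The hybrid alphabet hpa describes (standard base64 with '/' swapped for '_', keeping '+') can be sketched in Python; this is an illustrative reconstruction, not the actual patch:

```python
import base64

def hybrid_b64(sha1_bytes: bytes) -> str:
    """Encode a 20-byte SHA-1 with '+' and '_' as the only
    non-alphanumeric characters, so names never begin with '-'."""
    # altchars replaces the standard '+' '/' pair; here '+' stays
    # '+' and '/' becomes '_'.
    return base64.b64encode(sha1_bytes, altchars=b"+_").decode("ascii")

# 160 bits -> ceil(160/6) = 27 base64 digits, padded to 28 with '='.
name = hybrid_b64(bytes(20))
```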

gitcvt also now converts the HEAD file over.  This requires pointing it
at the .dircache/.git directory instead of the objects directory inside.
  I have tested it on both the git and the kernel-test repositories.

Checking out the total kernel tree (time checkout-cache -a into an empty 
directory):

	Cache cold	Cache hot
stock	3:46.95		19.95
base64	5:56.20		23.74
flat	2:44.13		15.68

It seems that the flat format, at least on ext3 with dircache, is 
actually a major performance win, and that the second level loses quite 
a bit.

	-hpa


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Yet another base64 patch
  2005-04-14  4:19 Yet another base64 patch H. Peter Anvin
  2005-04-14  2:24 ` Christopher Li
@ 2005-04-14  4:25 ` H. Peter Anvin
  2005-04-14  8:17 ` Linus Torvalds
  2005-04-15 23:55 ` Paul Dickson
  3 siblings, 0 replies; 34+ messages in thread
From: H. Peter Anvin @ 2005-04-14  4:25 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: git

H. Peter Anvin wrote:
> 
> It seems that the flat format, at least on ext3 with dircache, is 
> actually a major performance win, and that the second level loses quite 
> a bit.
> 

s/dircache/dir_index/

	-hpa

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Yet another base64 patch
  2005-04-14  2:24 ` Christopher Li
@ 2005-04-14  5:36   ` H. Peter Anvin
  2005-04-14  2:42     ` Christopher Li
  0 siblings, 1 reply; 34+ messages in thread
From: H. Peter Anvin @ 2005-04-14  5:36 UTC (permalink / raw)
  To: Christopher Li; +Cc: git

Christopher Li wrote:
> On Wed, Apr 13, 2005 at 09:19:48PM -0700, H. Peter Anvin wrote:
> 
>>Checking out the total kernel tree (time checkout-cache -a into an empty 
>>directory):
>>
>>	Cache cold	Cache hot
>>stock	3:46.95		19.95
>>base64	5:56.20		23.74
>>flat	2:44.13		15.68
> 
>>It seems that the flat format, at least on ext3 with dircache, is 
>>actually a major performance win, and that the second level loses quite 
>>a bit.
> 
> That is not surprising, due to the directory index in ext3. Htree is pretty
> good at random access, and the hashed file names distribute evenly, which is
> the best case for htree.
> 

Right, so by not trying to do the filesystem's job for it we actually 
come out ahead.

	-hpa

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Yet another base64 patch
  2005-04-14  2:42     ` Christopher Li
@ 2005-04-14  6:27       ` H. Peter Anvin
  2005-04-14  6:35         ` H. Peter Anvin
  2005-04-14  7:40         ` Linus Torvalds
  0 siblings, 2 replies; 34+ messages in thread
From: H. Peter Anvin @ 2005-04-14  6:27 UTC (permalink / raw)
  To: Christopher Li; +Cc: git

Christopher Li wrote:
> 
> But if you write a large number of random files, once htree has a
> three-level index, htree suffers because it dirties random blocks very
> quickly; most dirtied blocks contain only one or two new entries.
> Ext3 will choke on that due to its limited journal size.
> 
> With a non-indexed directory, new entries are packed densely into the
> blocks, so far fewer blocks get dirtied; of course, lookup will suffer.
> 
> It depends on whether you want to check out fast or write a big tree
> fast; you can't win it all.
> 

Actually, the subdirectory hack has the same effect, so you lose 
regardless.  Doesn't mean that you can't construct cases where the 
subdirectory hack doesn't win, but I maintain that those are likely to 
be artificial.

It's probably worth noting that you have to assume htree is on, since 
that's the typical default for a Linux installation, even if you use the 
subdirectory hack.

	-hpa

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Yet another base64 patch
  2005-04-14  6:27       ` H. Peter Anvin
@ 2005-04-14  6:35         ` H. Peter Anvin
  2005-04-14  7:40         ` Linus Torvalds
  1 sibling, 0 replies; 34+ messages in thread
From: H. Peter Anvin @ 2005-04-14  6:35 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Christopher Li, git

H. Peter Anvin wrote:

> 
> Actually, the subdirectory hack has the same effect, so you lose 
> regardless.  Doesn't mean that you can't construct cases where the 
> subdirectory hack doesn't win, but I maintain that those are likely to 
> be artificial.
> 

That should, of course, be "... where the subdirectory hack does win ..."

Really, the subdirectory hack is a workaround for broken filesystems, 
and we don't use those anymore.

	-hpa

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Yet another base64 patch
  2005-04-14  6:27       ` H. Peter Anvin
  2005-04-14  6:35         ` H. Peter Anvin
@ 2005-04-14  7:40         ` Linus Torvalds
  2005-04-14 16:58           ` H. Peter Anvin
  1 sibling, 1 reply; 34+ messages in thread
From: Linus Torvalds @ 2005-04-14  7:40 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Christopher Li, git



On Wed, 13 Apr 2005, H. Peter Anvin wrote:
> 
> Actually, the subdirectory hack has the same effect, so you lose 
> regardless.  Doesn't mean that you can't construct cases where the 
> subdirectory hack doesn't win, but I maintain that those are likely to 
> be artificial.

I'll tell you why a flat object directory format simply isn't an option.

Hint: maximum directory size. It's limited by n_link, and it's almost
universally a 16-bit number on Linux (and generally artificially limited to
32000 entries).

In other words, if you ever expect to have more than 32000 objects, a flat 
space simply isn't possible.

		Linus

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Yet another base64 patch
  2005-04-14  4:19 Yet another base64 patch H. Peter Anvin
  2005-04-14  2:24 ` Christopher Li
  2005-04-14  4:25 ` H. Peter Anvin
@ 2005-04-14  8:17 ` Linus Torvalds
  2005-04-14 17:02   ` H. Peter Anvin
  2005-04-15 23:55 ` Paul Dickson
  3 siblings, 1 reply; 34+ messages in thread
From: Linus Torvalds @ 2005-04-14  8:17 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: git



On Wed, 13 Apr 2005, H. Peter Anvin wrote:
> 
> Checking out the total kernel tree (time checkout-cache -a into an empty 
> directory):
> 
> 		Cache cold	Cache hot
> stock		3:46.95		19.95
> base64	5:56.20		23.74
> flat		2:44.13		15.68

So why is "base64" worse than the stock one?

As mentioned, the "flat" version may be faster, but it really isn't an
option. 32000 objects is peanuts. Any respectable source tree may hit that
in a short time, and will break in horrible ways on many Linux
filesystems.

So you need at least a single level of subdirectory. 

What I don't get is why the stock hex version would be better than base64.

I like the result, I just don't _understand_ it.

		Linus

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Yet another base64 patch
  2005-04-14  7:40         ` Linus Torvalds
@ 2005-04-14 16:58           ` H. Peter Anvin
  2005-04-14 17:42             ` Linus Torvalds
  0 siblings, 1 reply; 34+ messages in thread
From: H. Peter Anvin @ 2005-04-14 16:58 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christopher Li, git

Linus Torvalds wrote:
> 
> I'll tell you why a flat object directory format simply isn't an option.
> 
> Hint: maximum directory size. It's limited by n_link, and it's almost
> universally a 16-bit number on Linux (and generally artifically limited to
> 32000 entries).
> 
> In other words, if you ever expect to have more than 32000 objects, a flat 
> space simply isn't possible.
> 

Eh?!  n_link limits the number of *subdirectories* a directory can 
contain, not the number of *entries*.
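hpa's correction can be checked directly: on traditionally-behaving Unix filesystems a directory's link count is 2 plus its number of subdirectories (each child's `..` entry adds a link), and plain files don't change it. A small illustrative Python check, not from the thread (note that some modern filesystems, e.g. btrfs, report a constant link count of 1 for directories):

```python
import os
import tempfile

# A directory starts with 2 links ('.' plus the parent's entry);
# each subdirectory's '..' entry adds one more.  Plain files add none.
with tempfile.TemporaryDirectory() as d:
    base = os.stat(d).st_nlink          # typically 2
    for i in range(5):
        open(os.path.join(d, f"file{i}"), "w").close()
    after_files = os.stat(d).st_nlink   # unchanged by plain files
    os.mkdir(os.path.join(d, "sub"))
    after_dir = os.stat(d).st_nlink     # one more, on traditional fs
```

So n_link caps subdirectory fanout (hence the ~32000 figure for ext2/ext3), not the number of plain-file entries.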

	-hpa

	

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Yet another base64 patch
  2005-04-14  8:17 ` Linus Torvalds
@ 2005-04-14 17:02   ` H. Peter Anvin
  0 siblings, 0 replies; 34+ messages in thread
From: H. Peter Anvin @ 2005-04-14 17:02 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds wrote:
> 
> So why is "base64" worse than the stock one?
> 
> As mentioned, the "flat" version may be faster, but it really isn't an
> option. 32000 objects is peanuts. Any respectable source tree may hit that
> in a short time, and will break in horrible ways on many Linux
> filesystems.
> 

If it does, it's not because of n_link; see previous email.

I have used ext2 filesystems with hundreds of thousands of files per 
directory back in 1996.  It was slow but didn't break anything.

The only filesystem I know of which has a 2^16 entry limit is FAT.

> So you need at least a single level of subdirectory. 
> 
> What I don't get is why the stock hex version would be better than base64.
> 
> I like the result, I just don't _understand_ it.

The base64 version has 2^12 subdirectories instead of 2^8 (I just used 2 
characters as the hash key, just like the hex version).  So it exacerbates 
the performance penalty of subdirectory hashing.
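The arithmetic behind the 2^12 vs 2^8 figure: two hex characters index 16^2 = 256 buckets, while two base64 characters index 64^2 = 4096, a 16x wider fanout for the same two-character key. A quick check:

```python
hex_buckets = 16 ** 2   # two hex chars of the SHA-1 -> 2^8  = 256
b64_buckets = 64 ** 2   # two base64 chars           -> 2^12 = 4096
ratio = b64_buckets // hex_buckets  # how much wider base64 fans out
```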

	-hpa

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Yet another base64 patch
  2005-04-14 16:58           ` H. Peter Anvin
@ 2005-04-14 17:42             ` Linus Torvalds
  2005-04-14 19:11               ` bert hubert
  0 siblings, 1 reply; 34+ messages in thread
From: Linus Torvalds @ 2005-04-14 17:42 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Christopher Li, git



On Thu, 14 Apr 2005, H. Peter Anvin wrote:
> 
> Eh?!  n_link limits the number of *subdirectories* a directory can 
> contain, not the number of *entries*.

Duh. I'm a git.

		Linus

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Yet another base64 patch
  2005-04-14 17:42             ` Linus Torvalds
@ 2005-04-14 19:11               ` bert hubert
  2005-04-14 19:25                 ` H. Peter Anvin
  0 siblings, 1 reply; 34+ messages in thread
From: bert hubert @ 2005-04-14 19:11 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: H. Peter Anvin, Christopher Li, git

On Thu, Apr 14, 2005 at 10:42:56AM -0700, Linus Torvalds wrote:
> > Eh?!  n_link limits the number of *subdirectories* a directory can 
> > contain, not the number of *entries*.
> 
> Duh. I'm a git.

That may be true :-), but from the "front lines" I can report that
directories with > 32000 or > 65000 entries are *asking* for trouble.  There
is a whole chain of systems that needs to get things right for huge
directories to work well, and it often is not that way.

Even though it should be.

So the question is: should git be the harbinger of improvements in this
area, or should it go with the flow?

Bert.

-- 
http://www.PowerDNS.com      Open source, database driven DNS Software 
http://netherlabs.nl              Open and Closed source services

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Yet another base64 patch
  2005-04-14 19:11               ` bert hubert
@ 2005-04-14 19:25                 ` H. Peter Anvin
  2005-04-14 21:47                   ` bert hubert
  0 siblings, 1 reply; 34+ messages in thread
From: H. Peter Anvin @ 2005-04-14 19:25 UTC (permalink / raw)
  To: bert hubert; +Cc: Linus Torvalds, Christopher Li, git

bert hubert wrote:
> 
> That may be true :-), but from the "front lines" I can report that
> directories with > 32000 or > 65000 entries is *asking* for trouble. There
> is a whole chain of systems that need to get things right for huge
> directories to work well, and it often is not that way.
> 

Specifics, please?

	-hpa

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Yet another base64 patch
  2005-04-14 19:25                 ` H. Peter Anvin
@ 2005-04-14 21:47                   ` bert hubert
  2005-04-15  0:44                     ` Linus Torvalds
  0 siblings, 1 reply; 34+ messages in thread
From: bert hubert @ 2005-04-14 21:47 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Linus Torvalds, Christopher Li, git

On Thu, Apr 14, 2005 at 12:25:40PM -0700, H. Peter Anvin wrote:
> >That may be true :-), but from the "front lines" I can report that
> >directories with > 32000 or > 65000 entries is *asking* for trouble. There
> >is a whole chain of systems that need to get things right for huge
> >directories to work well, and it often is not that way.
> >
> 
> Specifics, please?

We've seen even Linus assume there is a 65K limit, and it appears more
people have been confused.

The systems I've seen mess this up include backup tools (quite serious ones
too), NetApp NFS servers, Samba shares and archivers.

Some tools just fail visibly, which is good, others become so slow as to
effectively lock up, which was the case with the backup tools. 

I've quite often been able to fix broken systems by hashing directories -
many problems just vanish. 

It is too easy to get into an O(N^2) situation. Git may be able to deal with
it but you may hurt yourself when making backups, or if you ever want to
share your tree (possibly with yourself) over the network.

But if you live in an all Linux world, and use mostly tar and rsync, it
should work.

Bert.

-- 
http://www.PowerDNS.com      Open source, database driven DNS Software 
http://netherlabs.nl              Open and Closed source services

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Yet another base64 patch
  2005-04-14 21:47                   ` bert hubert
@ 2005-04-15  0:44                     ` Linus Torvalds
  2005-04-15  1:06                       ` H. Peter Anvin
  2005-04-15  1:07                       ` H. Peter Anvin
  0 siblings, 2 replies; 34+ messages in thread
From: Linus Torvalds @ 2005-04-15  0:44 UTC (permalink / raw)
  To: bert hubert; +Cc: H. Peter Anvin, Christopher Li, git



On Thu, 14 Apr 2005, bert hubert wrote:
> 
> It is too easy to get into an O(N^2) situation. Git may be able to deal with
> it but you may hurt yourself when making backups, or if you ever want to
> share your tree (possibly with yourself) over the network.

Even something as simple as "ls -l" has been known to have O(n**2)  
behaviour for big directories.

		Linus

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Yet another base64 patch
  2005-04-15  0:44                     ` Linus Torvalds
@ 2005-04-15  1:06                       ` H. Peter Anvin
  2005-04-17  4:10                         ` David Lang
  2005-04-15  1:07                       ` H. Peter Anvin
  1 sibling, 1 reply; 34+ messages in thread
From: H. Peter Anvin @ 2005-04-15  1:06 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: bert hubert, Christopher Li, git

Linus Torvalds wrote:
> 
> Even something as simple as "ls -l" has been known to have O(n**2)  
> behaviour for big directories.
> 

For filesystems with linear directories, sure.  For sane filesystems, it 
should have O(n log n).

	-hpa

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Yet another base64 patch
  2005-04-15  0:44                     ` Linus Torvalds
  2005-04-15  1:06                       ` H. Peter Anvin
@ 2005-04-15  1:07                       ` H. Peter Anvin
  2005-04-15  3:58                         ` Paul Jackson
  1 sibling, 1 reply; 34+ messages in thread
From: H. Peter Anvin @ 2005-04-15  1:07 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: bert hubert, Christopher Li, git

Linus Torvalds wrote:
> 
> On Thu, 14 Apr 2005, bert hubert wrote:
> 
>>It is too easy to get into an O(N^2) situation. Git may be able to deal with
>>it but you may hurt yourself when making backups, or if you ever want to
>>share your tree (possibly with yourself) over the network.
> 
> 
> Even something as simple as "ls -l" has been known to have O(n**2)  
> behaviour for big directories.
> 

Ultimately the question is: do we care about old (broken) filesystems?

	-hpa

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Yet another base64 patch
  2005-04-15  1:07                       ` H. Peter Anvin
@ 2005-04-15  3:58                         ` Paul Jackson
  2005-04-17  3:53                           ` David A. Wheeler
  0 siblings, 1 reply; 34+ messages in thread
From: Paul Jackson @ 2005-04-15  3:58 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: torvalds, ahu, git, git

Earlier, hpa wrote:
> The base64 version has 2^12 subdirectories instead of 2^8 (I just used 2 
> characters as the hash key just like the hex version.)

Later, hpa wrote:
> Ultimately the question is: do we care about old (broken) filesystems?

I'd imagine we care a little - just not a lot.

I'd think that going to 2^12 subdirectories, which with 2^12 entries per
subdirectory gets us to 16 million files before the leaf directories get
bigger than the parent, is a good tradeoff.
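Paul's figure checks out: 2^12 subdirectories with 2^12 entries each holds 2^24, roughly 16.8 million objects, before a leaf directory outgrows its 4096-entry parent. Worked out:

```python
subdirs = 2 ** 12             # 4096 leaf directories (two base64 chars)
entries_per_subdir = 2 ** 12  # point where a leaf outgrows the parent
capacity = subdirs * entries_per_subdir  # total objects at that point
```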

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@engr.sgi.com> 1.650.933.1373, 1.925.600.0401

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Yet another base64 patch
  2005-04-14  4:19 Yet another base64 patch H. Peter Anvin
                   ` (2 preceding siblings ...)
  2005-04-14  8:17 ` Linus Torvalds
@ 2005-04-15 23:55 ` Paul Dickson
  2005-04-18  6:28   ` H. Peter Anvin
  3 siblings, 1 reply; 34+ messages in thread
From: Paul Dickson @ 2005-04-15 23:55 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: git

On Wed, 13 Apr 2005 21:19:48 -0700, H. Peter Anvin wrote:

> Checking out the total kernel tree (time checkout-cache -a into an empty 
> directory):
> 
>         Cache cold      Cache hot
> stock   3:46.95         19.95
> base64  5:56.20         23.74
> flat    2:44.13         15.68
> 
> It seems that the flat format, at least on ext3 with dircache, is 
> actually a major performance win, and that the second level loses quite 
> a bit.

Since 160 bits does not go into base64 evenly anyway, what happens if
you use 2^10 instead of 2^12 for the subdir names?  That would be 1/4 the
directories of the base64 version given above.

	-Paul


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Yet another base64 patch
  2005-04-15  3:58                         ` Paul Jackson
@ 2005-04-17  3:53                           ` David A. Wheeler
  2005-04-17  4:05                             ` Paul Jackson
  0 siblings, 1 reply; 34+ messages in thread
From: David A. Wheeler @ 2005-04-17  3:53 UTC (permalink / raw)
  To: git

Paul Jackson wrote:
> Earlier, hpa wrote:
> 
>>The base64 version has 2^12 subdirectories instead of 2^8 (I just used 2 
>>characters as the hash key just like the hex version.)
> 
> Later, hpa wrote:
> 
>>Ultimately the question is: do we care about old (broken) filesystems?
> 
> 
> I'd imagine we care a little - just not alot.

Some people (e.g., me) would really like for "git"
to be more forgiving of nasty filesystems,
so that git can be used very widely.
I.e., be forgiving about case insensitivity,
poor performance, or problems with a large # of files
in a directory, etc.  You're already working to make
sure git handles filenames with spaces & i18n filenames,
a common failing of many other SCM systems.

If "git" is used for Linux kernel development & nothing else,
it's still a success.  But it'd be even better from
my point of view if "git" was a useful tool for MANY
other projects.  I think there are advantages, even if you
only plan to use git for the kernel, to making "git" easier
to use for other projects.  By making git less
sensitive to the filesystem, you'll attract more (non-kernel-dev)
users, some of whom will become new git developers who
add cool new functionality.

As noted in my SCM survey (http://www.dwheeler.com/essays/scm.html),
I think SCM Windows support is really important to a lot of
OSS projects.  Many OSS projects, even if they start
Unix/Linux only, spin off a Windows port, and it's
painful if their SCM can't run on Windows then.
Problems running on NFS filesystems have caused problems
with GNU Arch users (there are workarounds, but now you
need to learn about workarounds instead of things
"just working").  If nothing else, look at the history
of other SCM projects: all too many have undergone radical and
painful surgeries so that they can be more portable to
various filesystems.

It's a trade-off, I know.

--- David A. Wheeler

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Yet another base64 patch
  2005-04-17  3:53                           ` David A. Wheeler
@ 2005-04-17  4:05                             ` Paul Jackson
  2005-04-17  6:38                               ` David A. Wheeler
  2005-04-17 14:30                               ` Daniel Barkalow
  0 siblings, 2 replies; 34+ messages in thread
From: Paul Jackson @ 2005-04-17  4:05 UTC (permalink / raw)
  To: dwheeler; +Cc: git

David wrote:
> It's a trade-off, I know.

So where do you recommend we make that trade-off?

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@engr.sgi.com> 1.650.933.1373, 1.925.600.0401

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Yet another base64 patch
  2005-04-15  1:06                       ` H. Peter Anvin
@ 2005-04-17  4:10                         ` David Lang
  2005-04-18  6:23                           ` H. Peter Anvin
  0 siblings, 1 reply; 34+ messages in thread
From: David Lang @ 2005-04-17  4:10 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Linus Torvalds, bert hubert, Christopher Li, git

On Thu, 14 Apr 2005, H. Peter Anvin wrote:

> Linus Torvalds wrote:
>> 
>> Even something as simple as "ls -l" has been known to have O(n**2) 
>> behaviour for big directories.
>> 
>
> For filesystems with linear directories, sure.  For sane filesystems, it 
> should have O(n log n).

note that default configs of ext2 and ext3 don't qualify as sane 
filesystems by this definition.

ext3 does have an extension that you can enable to have it hash the 
directory entries, but even if you enable that on a filesystem you aren't 
guaranteed that it will be active (if the directory existed before it was 
turned on, or has been accessed by a kernel that didn't understand the 
extension, then the htree functionality won't be used until you manually 
tell the system to generate the tree)

David Lang

-- 
There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies.
  -- C.A.R. Hoare

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Yet another base64 patch
  2005-04-17  4:05                             ` Paul Jackson
@ 2005-04-17  6:38                               ` David A. Wheeler
  2005-04-17  8:16                                 ` Paul Jackson
  2005-04-17 18:19                                 ` Petr Baudis
  2005-04-17 14:30                               ` Daniel Barkalow
  1 sibling, 2 replies; 34+ messages in thread
From: David A. Wheeler @ 2005-04-17  6:38 UTC (permalink / raw)
  To: Paul Jackson; +Cc: git

Paul Jackson wrote:
> David wrote:
> 
>>It's a trade-off, I know.
> 
> 
> So where do you recommend we make that trade-off?

I'd look at some of the more constraining, yet still
common cases, and make sure it worked reasonably
well without requiring magic. My list would be:
ext2, ext3, NFS, and Windows' NTFS (stupid short filenames,
case-insensitive/case-preserving).  Samba shouldn't be
more constraining than NTFS, and I would expect ReiserFS
wouldn't be a constraining case.  Bonus points if the
name lengths are inside POSIX guarantees, but I bet the
POSIX limits are so tiny as to be laughable.  Bonus points for
CD-ROM format with the Rock Ridge extensions (I _think_ DVDs
and later use that format too, yes?), though if that
didn't work, tar files are an easy workaround.  Imagine a full
Linux kernel source repository for 30+ (pick a number) years:
can the filesystems handle the number of objects in those cases?
If it works, your infrastructure should be sufficiently
portable to "just work" on others too.

Anyway, my two cents.

--- David A. Wheeler

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Yet another base64 patch
  2005-04-17  6:38                               ` David A. Wheeler
@ 2005-04-17  8:16                                 ` Paul Jackson
  2005-04-17 17:51                                   ` David A. Wheeler
  2005-04-17 18:19                                 ` Petr Baudis
  1 sibling, 1 reply; 34+ messages in thread
From: Paul Jackson @ 2005-04-17  8:16 UTC (permalink / raw)
  To: dwheeler; +Cc: git

David wrote:
> My list would be:
> ext2, ext3, NFS, and Windows' NTFS (stupid short filenames,
> case-insensitive/case-preserving).

I'm no mind reader, but I'd bet a pretty penny that what you have in
mind and what Linus has in mind have no overlaps in their solution sets.

Happy coding ...

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@engr.sgi.com> 1.650.933.1373, 1.925.600.0401

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Yet another base64 patch
  2005-04-17  4:05                             ` Paul Jackson
  2005-04-17  6:38                               ` David A. Wheeler
@ 2005-04-17 14:30                               ` Daniel Barkalow
  2005-04-17 16:29                                 ` David A. Wheeler
  1 sibling, 1 reply; 34+ messages in thread
From: Daniel Barkalow @ 2005-04-17 14:30 UTC (permalink / raw)
  To: Paul Jackson; +Cc: dwheeler, git

On Sat, 16 Apr 2005, Paul Jackson wrote:

> David wrote:
> > It's a trade-off, I know.
> 
> So where do you recommend we make that trade-off?

So why do we have to be consistent? It seems like we need a standard
format for these reasons:

 - We use rsync to interact with remote repositories, and rsync won't
   understand if they aren't organized the same way. But I'm working on
   having everything go through git-specific code, which could understand
   different layouts.

 - Everything that shares a local repository needs to understand the
   format of that repository. But the filesystem constraints on the local
   repository will be the same regardless of who is looking, so they'd all
   expect the same format anyway.

So my idea is, once we're using git-smart transfer code (which can verify
objects, etc.), add support for different implementations of 
sha1_file_name suitable for different filesystems, and vary based either
on a compile-time option or on a setting stored in the objects
directory. The only thing that matters is that repositories on
non-special web servers have a standard format, because they'll be serving
objects by URL, not by sha1.
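Daniel's idea — a per-repository choice of object-name-to-path mapping, selected by a setting stored in the objects directory — could look roughly like this sketch. The layout names and the function shape are hypothetical illustrations, not anything git actually had:

```python
# Hypothetical sketch: sha1_file_name dispatches on a per-repository
# layout setting instead of a compile-time constant.

def sha1_file_name(objects_dir: str, hexname: str, layout: str = "hex2") -> str:
    if layout == "flat":   # everything in one big directory
        return f"{objects_dir}/{hexname}"
    if layout == "hex2":   # stock git: first two hex chars as fanout dir
        return f"{objects_dir}/{hexname[:2]}/{hexname[2:]}"
    raise ValueError(f"unknown layout: {layout}")

sha1 = "da39a3ee5e6b4b0d3255bfef95601890afd80709"
path = sha1_file_name("objects", sha1, "hex2")
```

Transfer code that verifies objects by hash can translate between layouts transparently; only dumb-server URLs need a fixed convention, which is exactly Daniel's point.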

	-Daniel
*This .sig left intentionally blank*


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Yet another base64 patch
  2005-04-17 14:30                               ` Daniel Barkalow
@ 2005-04-17 16:29                                 ` David A. Wheeler
  0 siblings, 0 replies; 34+ messages in thread
From: David A. Wheeler @ 2005-04-17 16:29 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Paul Jackson, git

I wrote:
>>>It's a trade-off, I know.

Paul Jackson replied:
>>So where do you recommend we make that trade-off?

Daniel Barkalow wrote:
> So why do we have to be consistent? It seems like we need a standard
> format for these reasons:
> 
>  - We use rsync to interact with remote repositories, and rsync won't
>    understand if they aren't organized the same way. But I'm working on
>    having everything go through git-specific code, which could understand
>    different layouts.
> 
>  - Everything that shares a local repository needs to understand the
>    format of that repository. But the filesystem constraints on the local
>    repository will be the same regardless of who is looking, so they'd all
>    expect the same format anyway.
> 
> So my idea is, once we're using git-smart transfer code (which can verify
> objects, etc.), add support for different implementations of 
> sha1_file_name suitable for different filesystems, and vary based either
> on a compile-time option or on a setting stored in the objects
> directory.

I think that's the perfect answer: make it a setting stored
in the objects directory (presumably set during
initialization of the directory), and handled automagically
by the tools.  I recommend that handling them NOT be a compile-time option,
so that the same set of tools works everywhere automatically
(who wants to recompile tools just to work on a different file layout?).


> The only thing that matters is that repositories on
> non-special web servers have a standard format, because they'll be serving
> objects by URL, not by sha1.

If the "layout info" is stored in a standard location for a
given repository, then the rest doesn't matter. The library would just
download that, then know how to find the rest.

--- David A. Wheeler

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: Yet another base64 patch
  2005-04-17  8:16                                 ` Paul Jackson
@ 2005-04-17 17:51                                   ` David A. Wheeler
  0 siblings, 0 replies; 34+ messages in thread
From: David A. Wheeler @ 2005-04-17 17:51 UTC (permalink / raw)
  To: Paul Jackson; +Cc: git

Paul Jackson wrote:
> David wrote:
> 
>>My list would be:
>>ext2, ext3, NFS, and Windows' NTFS (stupid short filenames,
>>case-insensitive/case-preserving).
> 
> 
> I'm no mind reader, but I'd bet a pretty penny that what you have in
> mind and what Linus has in mind have no overlaps in their solution sets.

Sadly, I lack the mind-reading ability as well.

Our goals are, I suspect, somewhat different.
Linus wants to build a tool that meets his specific needs
(managing kernel development), and he has particular requirements
(such as fast simple merging when working at large scales).
In contrast, I'm hoping for a more
general OSS/FS SCM tool that many others can use as well.

But I think there's heavy overlap in the solution space.
The Linux kernel project is, to my knowledge, the largest
project using a truly distributed SCM process.
Anyone else who is considering a distributed SCM process
would at _least_ want to think about how the Linux kernel
project works, and if they're doing so, they
might also want to reuse the development tools.

I'm just taking a peek,
looking for situations where a design decision is irrelevant
for his purposes but a particular direction would be
especially helpful to other projects.  I'm more worried about the
storage format; if the code doesn't support some particular
feature but it could be added later without great pain, no big deal.
If something would imply a complete rewrite, that's undesirable.

--- David A. Wheeler


* Re: Yet another base64 patch
  2005-04-17  6:38                               ` David A. Wheeler
  2005-04-17  8:16                                 ` Paul Jackson
@ 2005-04-17 18:19                                 ` Petr Baudis
  2005-04-18  5:13                                   ` David A. Wheeler
  1 sibling, 1 reply; 34+ messages in thread
From: Petr Baudis @ 2005-04-17 18:19 UTC (permalink / raw)
  To: David A. Wheeler; +Cc: Paul Jackson, git

Dear diary, on Sun, Apr 17, 2005 at 08:38:10AM CEST, I got a letter
where "David A. Wheeler" <dwheeler@dwheeler.com> told me that...
> I'd look at some of the more constraining, yet still
> common cases, and make sure it worked reasonably
> well without requiring magic. My list would be:
> ext2, ext3, NFS, and Windows' NTFS (stupid short filenames,
> case-insensitive/case-preserving).  Samba shouldn't be
> more constraining than NTFS, and I would expect ReiserFS
> wouldn't be a constraining case.  Bonus points if the
> name lengths are inside POSIX guarantees, but I bet the
> POSIX limits are so tiny as to be laughable.  Bonus points for
> CD-ROM format with the Rock Ridge extensions (I _think_ DVDs
> and later use that format too, yes?), though if that
> didn't work tar files are an easy workaround. Imagine a full
> Linux kernel source repository, for 30+ (pick a number) years..
> can the filesystems handle the number of objects in those cases?
> If it works, your infrastructure should be sufficiently
> portable to "just work" on others too.

I personally don't mind getting it to work in more places, as long as
it doesn't make git work (measurably) worse on modern Linux systems and
the code doesn't go to hell; you tell me what needs to be done and
preferably give me the patches. ;-)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: Yet another base64 patch
  2005-04-17 18:19                                 ` Petr Baudis
@ 2005-04-18  5:13                                   ` David A. Wheeler
  2005-04-18 12:59                                     ` Kevin Smith
  0 siblings, 1 reply; 34+ messages in thread
From: David A. Wheeler @ 2005-04-18  5:13 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Paul Jackson, git

I said:
>>I'd look at some of the more constraining, yet still
>>common cases, and make sure it worked reasonably
>>well without requiring magic. My list would be:
>>ext2, ext3, NFS, and Windows' NTFS (stupid short filenames,
>>case-insensitive/case-preserving).

Petr Baudis replied:
> I personally don't mind getting it to work in more places, as long as
> it doesn't make git work (measurably) worse on modern Linux systems and
> the code doesn't go to hell; you tell me what needs to be done and
> preferably give me the patches. ;-)

Okay, that's great.

The one potential issue I know of (after trying to read from the
firehose^Wlist archives) is that some are worried about how poor
filesystems behave when there are a large number of objects in an
object directory.

After doing some calculations, it seems to me that perhaps this
isn't really such a big deal, if there's a top directory such as
the 8-bit (2-char) top directory currently in git-pasky.
Removing the top directory would improve performance for the better
filesystems, but would be an absolute KILLER to poorer systems, so
I'd keep the 2**8 top directory just as it is in git-pasky.
It's a compromise that means people can ease into git, and then
switch when their projects grow to large sizes.
My calculations are below, but I could be mistaken; let me
know if I'm all wet.

Does anyone know of any other issues in how git data is stored that
might cause problems for some situations?  Windows' case-insensitive/
case-preserving model for NTFS and vfat32 seems to be enough
(since the case is preserved) so that the format should work,
and you can just demand that
special git files use Unix formats ("/" as dir separator,
Unix end-of-lines).  The implementation would currently need
changes to work easily on Windows (dealing with binary opens at least,
and probably rewriting the shell programs for those unwilling to
install Cygwin), but those can be done later if desired
without interfering with the interface formats.


========================= Details =========================

Basically, I'd like "git" to:
(1) work on nearly ANY system for small-to-medium projects,
     even if their filesystems do linear searches in directories,
     over a lengthy time; ideally it would even be possible
     (though slow) on larger projects.
(2) work well on large projects (e.g., kernel) on _common_
     development platforms (ext2, ext3, NTFS, NFS).

It all depends on what you're optimizing for; but humor me
if those were your requirements...

Case 1:
The top (2-char) directory appears likely to make small projects
perform okay, and large projects possible, on stupid filesystems.
The one level extra directory is actually not a bad compromise
to make things "just work" on just about anything for smaller scales.
* git-pasky (a tiny project) has ~2K objects in 2 weeks; at that pace,
4K objects/month for 10 years, you'd have 480K objects.
That's an absurd count for even a tiny project, and it's unlikely that
a participant in a tiny project would be willing to change
filesystems just to participate.  But then if you
divide it among 256 directories = 1875 files/directory average.
Linear search is undesirable (about 1000 entry checks on
average to find each entry), but it's nowhere near the
2^16 dir entries that made people afraid.
Switching to a 2^12 top directory, you have an average of 117 entries
in each subdir (and 4096 entries at the top), yielding
an average of (117+4096)/2 = 2106 entry checks to find an entry.
* I estimated also for the big end, using the Linux kernel;
I guesstimated 36,000 objects/month for the kernel**. Over 10 years that
accumulates 4,320,000 objects, completely insane for a flat file
on a stupid filesystem. If it has a one-level 256dir directory, that's
16875 objects/directory.  Now THAT'S painful,
though nowhere near the 2^16 limit most quoted as bad.
* For 10K objects/month, and a top dir of 2**8, you have 1,200,000
objects; each dir has 4680 entries (average lookup: 2468 entries).
Dividing into 2**12 has 292/directory, average lookup: 2194.

On 2**12 vs. 2**8, it's not clear-cut. 2**8 works best for small
projects, 2**12 for larger.  My guess is that stupid filesystems
will tend to be used primarily only on small projects, so 2**8 might
be the better choice but that's debatable.
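My back-of-the-envelope arithmetic above can be reproduced with a few lines; the formula (average linear lookup is roughly half the top directory plus half of one subdirectory) is my own simplification, matching the numbers I quoted:

```python
def avg_lookup(total_objects, fanout):
    """Objects per subdir, and average entry checks for a linear-search
    lookup: half the top directory plus half of one subdirectory."""
    per_dir = total_objects / fanout
    return per_dir, (fanout + per_dir) / 2

# 480K objects (tiny project over 10 years)
print(avg_lookup(480_000, 2**8))    # ~1875 files/dir
print(avg_lookup(480_000, 2**12))   # ~117 files/dir, ~2106 checks
# 1.2M objects (10K objects/month over 10 years)
print(avg_lookup(1_200_000, 2**8))  # ~4687 files/dir, ~2471 checks
print(avg_lookup(1_200_000, 2**12)) # ~293 files/dir, ~2194 checks
```

The crossover is visible: a bigger fanout shrinks the subdirs but the top directory itself starts to dominate the linear scan.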

Case 2:
Thankfully, adequate systems are finally more common, and they're
common enough that for really large projects (kernel) it seems
reasonable to demand such filesystems.
Ext2 & ext3 have had htree for a while now, and it's enabled by
default on at least Fedora Core 3.  If it's off, just do:
  tune2fs -O dir_index /dev/hdFOO; e2fsck -fD /dev/hdFOO
Htree has been around long enough that running this should be
trivial for any developer today.
ReiserFS has hashing too.  Windows' NTFS does
tree-balancing (apparently not as good as the hashed htree
system of ext2/ext3, but it should work tolerably since it's no
longer a linear search).  One useful factoid: for good NTFS
performance with git on large projects,
you should disable short name generation on the big directories
(Microsoft recommends this when >300,000 names are in one dir).
NTFS (and VFAT32) allow filenames up to 255 chars, and
filepaths up to 260 chars, so that seems okay.
I was primarily concerned about NTFS, and that seems to have
the necessities.  This info should be in some FAQ or
documentation ("Using git for large projects").

It _seems_ to me that the NFS implementations are likely to
do similar things, but I don't know.  And I've not tested
anything on real systems, which is the real test.
Anyone know more about the limits of the NFS implementations?

More directory levels could be created to make
stupid filesystems happier, but that interferes with smart filesystems.
You could try to make filesystem layout a per-user issue,
but that makes using rsync more complicated.
A link farm could be created, though those are a pain to maintain.
It DOES turn out there are many alternatives if necessary, e.g.,
configurations per object database, or automatically "fixing"
things for a local configuration as data comes in or out,
though if you can avoid that it'd be better.


** Looking at "linux-2.4.0-to-2.6.12-rc2-patchset", I count
28237 patches; "RCS file:" occurs 188119 times & I'll claim
that that approximates the number of different file objects
IF there were no intermediate files.  If on average there are
5 versions of a file before it gets into the mainline,
and 3 commits before the final mainline patch, I get
approximately this many objects in a "real" object db:
  (28237*(3+1) trees) *2 (if #commits==#trees) +
  (188119*(5+1) file objs))
= 1,354,610 objects from 2002/02/05 to 2005/04/04
= about 36,000 objects/month.
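For anyone who wants to re-run my guesstimate (the 5-versions-per-file and 3-commits-per-patch multipliers are exactly the guesses stated above, nothing more):

```python
patches   = 28_237   # patches counted in linux-2.4.0-to-2.6.12-rc2-patchset
file_revs = 188_119  # occurrences of "RCS file:", ~ distinct file objects

trees   = patches * (3 + 1)    # 3 intermediate commits + final, one tree each
commits = trees                # assume #commits == #trees
files   = file_revs * (5 + 1)  # 5 intermediate versions + final
total   = trees + commits + files

months  = 38                   # 2002-02-05 .. 2005-04-04
print(total, total // months)  # 1,354,610 objects, ~36,000/month
```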


Am I missing anything?

--- David A. Wheeler


* Re: Yet another base64 patch
  2005-04-17  4:10                         ` David Lang
@ 2005-04-18  6:23                           ` H. Peter Anvin
  0 siblings, 0 replies; 34+ messages in thread
From: H. Peter Anvin @ 2005-04-18  6:23 UTC (permalink / raw)
  To: David Lang; +Cc: Linus Torvalds, bert hubert, Christopher Li, git

David Lang wrote:
> 
> note that default configs of ext2 and ext3 don't qualify as sane 
> filesystems by this definition.

Not using dir_index *IS* insane.

	-hpa


* Re: Yet another base64 patch
  2005-04-15 23:55 ` Paul Dickson
@ 2005-04-18  6:28   ` H. Peter Anvin
  0 siblings, 0 replies; 34+ messages in thread
From: H. Peter Anvin @ 2005-04-18  6:28 UTC (permalink / raw)
  To: Paul Dickson; +Cc: git

Paul Dickson wrote:
> 
> Since 160 bits does not go into base64 evenly anyway, what happens if
> you use 2^10 instead of 2^12 for the subdir names?  That will be 1/4 the
> directories of the base64 given above.
> 

I was going to try one-character subdirs, so 2^6, but I haven't had a 
chance to do that since I'm at LCA.

Anyway, I'm starting to suspect it's too late to change the format, 
especially since Linus seems highly disinclined.

	-hpa


* Re: Yet another base64 patch
  2005-04-18  5:13                                   ` David A. Wheeler
@ 2005-04-18 12:59                                     ` Kevin Smith
  2005-04-18 16:42                                       ` David A. Wheeler
  0 siblings, 1 reply; 34+ messages in thread
From: Kevin Smith @ 2005-04-18 12:59 UTC (permalink / raw)
  Cc: git

David A. Wheeler wrote:
> Does anyone know of any other issues in how git data is stored that
> might cause problems for some situations?  Windows' case-insensitive/
> case-preserving model for NTFS and vfat32 seems to be enough
> (since the case is preserved) so that the format should work,

If git is retaining hex naming, and not moving to base64, then I don't
think what I am about to say is relevant. However, if base64 file naming
is still being considered, then vfat32 compatibility may be a concern
(I'm not sure about NTFS). Although it is case-preserving, it actually
considers both cases as being the same name. So AaA would overwrite aAa.

If I'm doing the math right, we would effectively be ignoring roughly
one out of 6 base64 bits. This would reduce the collision avoidance
capability of SHA-1 (on vfat32) from 160 bits to about 133 bits. Still
strong, and probably acceptable, but worth noting.
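To make my rough math checkable (the 38-symbol folded alphabet, i.e. 26 case-folded letters plus 10 digits plus '+' and '/', is my assumption about the base64 variant in question): a more precise count comes out a bit above my 133-bit figure, but the conclusion, still strong, holds either way.

```python
import math

bits_per_char  = 6             # base64: 64 distinct symbols per char
folded_symbols = 26 + 10 + 2   # case-folding collapses 52 letters to 26
folded_bits    = math.log2(folded_symbols)   # ~5.25 effective bits/char
chars          = math.ceil(160 / bits_per_char)  # 27 chars for 160 bits

print(chars * folded_bits)     # more precise effective strength
print(160 * 5 / 6)             # my rougher "drop 1 of 6 bits" estimate
```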

I'll take this opportunity to support David's position that it would be
fantastic if git could end up being valuable for a wide range of
projects, rather than just the kernel. I also fully understand that the
kernel is the primary target, but when there are opportunities to make
the data structures more generally useful without causing problems for
the kernel project, I hope they are taken.

Thanks,

Kevin


* Re: Yet another base64 patch
  2005-04-18 12:59                                     ` Kevin Smith
@ 2005-04-18 16:42                                       ` David A. Wheeler
  0 siblings, 0 replies; 34+ messages in thread
From: David A. Wheeler @ 2005-04-18 16:42 UTC (permalink / raw)
  To: yarcs; +Cc: git

I asked:
> > Does anyone know of any other issues in how git data is stored that
> > might cause problems for some situations? ...

Kevin said:
> If git is retaining hex naming, and not moving to base64, then I don't
> think what I am about to say is relevant. However, if base64 file naming
> is still being considered, then vfat32 compatibility may be a concern
> (I'm not sure about NTFS).

I can't speak for the git developers. However, I think the current
naming scheme for the object database as used in git-pasky
is actually a very good one and should be left as-is
(SHA-1 hex values, directory of 2-char prefixes,
filenames with the rest of the value).

As far as I can tell from various calculations (& supported by the
performance measurements done by others), hex values
with one level of directory turn out to work pretty well!
It's easily understood, works with non-massive projects on stupid
filesystems, and it has good performance on good filesystems
even with massive projects with huge histories.  You could
tune it further, but a single approach that works "everywhere"
is a whole lot simpler.  So I'd recommend keeping that
approach.

As far as base64/32 vs. hex names, I think there
are many reasons to stay with the hex names.
Using hex names is a good idea for the simple reason that
normally SHA-1 hashes are presented as hex values;
you'll work WITH instead of AGAINST other tools, and
humans who deal with this stuff will "see what they expect".
It takes a few more characters, but not many, and it's not
like base64 is any more comprehensible to humans.
And the fact that hex values don't allow "all" legal values
means that some errors are trivially detectable.
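A tiny illustration of that error-detection point (a sketch of mine, not git code): any character outside [0-9a-f], or a wrong length, immediately flags a mangled name, whereas almost any corrupted base64 string is still syntactically valid base64.

```python
def looks_like_object_name(name):
    """A 40-char lowercase-hex check catches many mangled names for free."""
    return len(name) == 40 and all(c in "0123456789abcdef" for c in name)

print(looks_like_object_name("da39a3ee5e6b4b0d3255bfef95601890afd80709"))  # True
print(looks_like_object_name("da39a3ee5e6b4b0d3255bfef95601890afd8070z"))  # False
```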

You're right, base64 eliminates many bits of differentiation,
and in a very non-obvious way (I _hate_ weird surprises like
that, they cause lots of trouble).  I think there's another
problem too that's more insidious. Although the _filesystem_
is case-preserving, I suspect some _tools_ on Windows don't take
care to preserve case.  If that's so, a Windows user could easily
use some tools that screw up a Unix/Linux user
once the files were imported, causing all sorts of "extraneous" files &
files that mysteriously disappeared (they were only accessible
from Windows). Ugh.
This can even happen on Unix/Linux systems if they use
a fileserver with NTFS semantics. In contrast,
if a hex value has its case changed, it's easy to fix locally.

By choosing the more traditional hex representation, you
eliminate lots of problems, and it's easier to explain too.

Kevin added:
> I'll take this opportunity to support David's position that it would be
> fantastic if git could end up being valuable for a wide range of
> projects, rather than just the kernel. I also fully understand that the
> kernel is the primary target, but when there are opportunities to make
> the data structures more generally useful without causing problems for
> the kernel project, I hope they are taken.

Thanks for the vote of confidence!

--- David A. Wheeler


end of thread, other threads:[~2005-04-18 16:38 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-04-14  4:19 Yet another base64 patch H. Peter Anvin
2005-04-14  2:24 ` Christopher Li
2005-04-14  5:36   ` H. Peter Anvin
2005-04-14  2:42     ` Christopher Li
2005-04-14  6:27       ` H. Peter Anvin
2005-04-14  6:35         ` H. Peter Anvin
2005-04-14  7:40         ` Linus Torvalds
2005-04-14 16:58           ` H. Peter Anvin
2005-04-14 17:42             ` Linus Torvalds
2005-04-14 19:11               ` bert hubert
2005-04-14 19:25                 ` H. Peter Anvin
2005-04-14 21:47                   ` bert hubert
2005-04-15  0:44                     ` Linus Torvalds
2005-04-15  1:06                       ` H. Peter Anvin
2005-04-17  4:10                         ` David Lang
2005-04-18  6:23                           ` H. Peter Anvin
2005-04-15  1:07                       ` H. Peter Anvin
2005-04-15  3:58                         ` Paul Jackson
2005-04-17  3:53                           ` David A. Wheeler
2005-04-17  4:05                             ` Paul Jackson
2005-04-17  6:38                               ` David A. Wheeler
2005-04-17  8:16                                 ` Paul Jackson
2005-04-17 17:51                                   ` David A. Wheeler
2005-04-17 18:19                                 ` Petr Baudis
2005-04-18  5:13                                   ` David A. Wheeler
2005-04-18 12:59                                     ` Kevin Smith
2005-04-18 16:42                                       ` David A. Wheeler
2005-04-17 14:30                               ` Daniel Barkalow
2005-04-17 16:29                                 ` David A. Wheeler
2005-04-14  4:25 ` H. Peter Anvin
2005-04-14  8:17 ` Linus Torvalds
2005-04-14 17:02   ` H. Peter Anvin
2005-04-15 23:55 ` Paul Dickson
2005-04-18  6:28   ` H. Peter Anvin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).