All of lore.kernel.org
 help / color / mirror / Atom feed
* lexicographic ordering is not always best
@ 2003-02-17 23:38 Hans Reiser
  2003-02-18 10:54 ` Nikita Danilov
  0 siblings, 1 reply; 3+ messages in thread
From: Hans Reiser @ 2003-02-17 23:38 UTC (permalink / raw)
  To: nikita; +Cc: ReiserFS

Suppose that you have small directories, such that the time to do linear 
searching within the directory is not significant.

Suppose that you have a tendency to access files in readdir() order, and 
having files laid out in an order that is the same as the directory 
order is performance valuable.

Suppose that you create files too slowly for allocate on flush to fix 
this problem, and access them too soon for the repacker to fix this problem.

In that case, ordering both directory entries and file bodies in a first 
created first ordered order is optimal.

How much work would it be to create a reiser4 directory plugin to order 
in creation time order?  Could you do this by simply setting the hash 
field always to zero for that plugin, and letting the duplicate key code 
handle things?  If it is trivial to do, it might be useful.  Especially 
for the analysis of the performance of our algorithms on various benchmarks.

Are you ready to work on implementing file body key assignment in order 
of directory entries?

-- 
Hans



^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: lexicographic ordering is not always best
  2003-02-17 23:38 lexicographic ordering is not always best Hans Reiser
@ 2003-02-18 10:54 ` Nikita Danilov
  2003-02-18 11:10   ` Hans Reiser
  0 siblings, 1 reply; 3+ messages in thread
From: Nikita Danilov @ 2003-02-18 10:54 UTC (permalink / raw)
  To: Hans Reiser; +Cc: ReiserFS

Hans Reiser writes:
 > Suppose that you have small directories, such that the time to do linear 
 > searching within the directory is not significant.
 > 
 > Suppose that you have a tendency to access files in readdir() order, and 
 > having files laid out in an order that is the same as the directory 
 > order is performance valuable.
 > 
 > Suppose that you create files too slowly for allocate on flush to fix 
 > this problem, and access them too soon for the repacker to fix this problem.
 > 
 > In that case, ordering both directory entries and file bodies in a first 
 > created first ordered order is optimal.
 > 
 > How much work would it be to create a reiser4 directory plugin to order 
 > in creation time order?  Could you do this by simply setting the hash 
 > field always to zero for that plugin, and letting the duplicate key code 
 > handle things?  If it is trivial to do, it might be useful.  Especially 
 > for the analysis of the performance of our algorithms on various benchmarks.

I do not see how this is possible. Lookup has to find the directory
entry given a name. If directory entries are ordered in the "creation"
order, some component(s) of directory entry key (ones that order
directory entry by creation) are unknown at the lookup time.

The only solutions that come to my mind are either double index
directory entries (this will solve a bunch of other problems as well),
or resort to a sequential scan (this is exactly what ext2 does, and, by
the way ext2 also mostly sorts directory entries by the creation time).

Have I missed something in your proposal?

Nikita.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: lexicographic ordering is not always best
  2003-02-18 10:54 ` Nikita Danilov
@ 2003-02-18 11:10   ` Hans Reiser
  0 siblings, 0 replies; 3+ messages in thread
From: Hans Reiser @ 2003-02-18 11:10 UTC (permalink / raw)
  To: Nikita Danilov; +Cc: ReiserFS

Nikita Danilov wrote:

>Hans Reiser writes:
> > Suppose that you have small directories, such that the time to do linear 
> > searching within the directory is not significant.
> > 
> > Suppose that you have a tendency to access files in readdir() order, and 
> > having files laid out in an order that is the same as the directory 
> > order is performance valuable.
> > 
> > Suppose that you create files too slowly for allocate on flush to fix 
> > this problem, and access them too soon for the repacker to fix this problem.
> > 
> > In that case, ordering both directory entries and file bodies in a first 
> > created first ordered order is optimal.
> > 
> > How much work would it be to create a reiser4 directory plugin to order 
> > in creation time order?  Could you do this by simply setting the hash 
> > field always to zero for that plugin, and letting the duplicate key code 
> > handle things?  If it is trivial to do, it might be useful.  Especially 
> > for the analysis of the performance of our algorithms on various benchmarks.
>
>I do not see how this is possible. Lookup has to find the directory
>entry given a name. If directory entries are ordered in the "creation"
>order, some component(s) of directory entry key (ones that order
>directory entry by creation) are unknown at the lookup time.
>
No, they are 0 (identical, duplicates).

>
>The only solutions that come to my mind are either double index
>directory entries (this will solve a bunch of other problems as well),
>or resort to a sequential scan (this is exactly what ext2 does, and, by
>the way ext2 also mostly sorts directory entries by the creation time).
>
Yes, sequential scan is what I meant.  I do not intend this as the 
default, but merely as an option.

>
>Have I missed something in your proposal?
>
>Nikita.
>
>
>  
>


-- 
Hans



^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2003-02-18 11:10 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2003-02-17 23:38 lexicographic ordering is not always best Hans Reiser
2003-02-18 10:54 ` Nikita Danilov
2003-02-18 11:10   ` Hans Reiser

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.