* lexicographic ordering is not always best
@ 2003-02-17 23:38 Hans Reiser
2003-02-18 10:54 ` Nikita Danilov
0 siblings, 1 reply; 3+ messages in thread
From: Hans Reiser @ 2003-02-17 23:38 UTC (permalink / raw)
To: nikita; +Cc: ReiserFS
Suppose that you have small directories, such that the time to do linear
searching within the directory is not significant.
Suppose that you have a tendency to access files in readdir() order, and
having files laid out in an order that is the same as the directory
order is performance valuable.
Suppose that you create files too slowly for allocate on flush to fix
this problem, and access them too soon for the repacker to fix this problem.
In that case, ordering both directory entries and file bodies in a first
created first ordered order is optimal.
How much work would it be to create a reiser4 directory plugin to order
in creation time order? Could you do this by simply setting the hash
field always to zero for that plugin, and letting the duplicate key code
handle things? If it is trivial to do, it might be useful. Especially
for the analysis of the performance of our algorithms on various benchmarks.
Are you ready to work on implementing file body key assignment in order
of directory entries?
--
Hans
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: lexicographic ordering is not always best
2003-02-17 23:38 lexicographic ordering is not always best Hans Reiser
@ 2003-02-18 10:54 ` Nikita Danilov
2003-02-18 11:10 ` Hans Reiser
0 siblings, 1 reply; 3+ messages in thread
From: Nikita Danilov @ 2003-02-18 10:54 UTC (permalink / raw)
To: Hans Reiser; +Cc: ReiserFS
Hans Reiser writes:
> Suppose that you have small directories, such that the time to do linear
> searching within the directory is not significant.
>
> Suppose that you have a tendency to access files in readdir() order, and
> having files laid out in an order that is the same as the directory
> order is performance valuable.
>
> Suppose that you create files too slowly for allocate on flush to fix
> this problem, and access them too soon for the repacker to fix this problem.
>
> In that case, ordering both directory entries and file bodies in a first
> created first ordered order is optimal.
>
> How much work would it be to create a reiser4 directory plugin to order
> in creation time order? Could you do this by simply setting the hash
> field always to zero for that plugin, and letting the duplicate key code
> handle things? If it is trivial to do, it might be useful. Especially
> for the analysis of the performance of our algorithms on various benchmarks.
I do not see how this is possible. Lookup has to find the directory
entry given a name. If directory entries are ordered in the "creation"
order, some component(s) of directory entry key (ones that order
directory entry by creation) are unknown at the lookup time.
The only solutions that come to my mind are either double index
directory entries (this will solve a bunch of other problems as well),
or resort to a sequential scan (this is exactly what ext2 does, and, by
the way ext2 also mostly sorts directory entries by the creation time).
Have I missed something in your proposal?
Nikita.
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: lexicographic ordering is not always best
2003-02-18 10:54 ` Nikita Danilov
@ 2003-02-18 11:10 ` Hans Reiser
0 siblings, 0 replies; 3+ messages in thread
From: Hans Reiser @ 2003-02-18 11:10 UTC (permalink / raw)
To: Nikita Danilov; +Cc: ReiserFS
Nikita Danilov wrote:
>Hans Reiser writes:
> > Suppose that you have small directories, such that the time to do linear
> > searching within the directory is not significant.
> >
> > Suppose that you have a tendency to access files in readdir() order, and
> > having files laid out in an order that is the same as the directory
> > order is performance valuable.
> >
> > Suppose that you create files too slowly for allocate on flush to fix
> > this problem, and access them too soon for the repacker to fix this problem.
> >
> > In that case, ordering both directory entries and file bodies in a first
> > created first ordered order is optimal.
> >
> > How much work would it be to create a reiser4 directory plugin to order
> > in creation time order? Could you do this by simply setting the hash
> > field always to zero for that plugin, and letting the duplicate key code
> > handle things? If it is trivial to do, it might be useful. Especially
> > for the analysis of the performance of our algorithms on various benchmarks.
>
>I do not see how this is possible. Lookup has to find the directory
>entry given a name. If directory entries are ordered in the "creation"
>order, some component(s) of directory entry key (ones that order
>directory entry by creation) are unknown at the lookup time.
>
No, they are 0 (identical, duplicates).
>
>The only solutions that come to my mind are either double index
>directory entries (this will solve a bunch of other problems as well),
>or resort to a sequential scan (this is exactly what ext2 does, and, by
>the way ext2 also mostly sorts directory entries by the creation time).
>
Yes, sequential scan is what I meant. I do not intend this as the
default, but merely as an option.
>
>Have I missed something in your proposal?
>
>Nikita.
>
>
>
>
--
Hans
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2003-02-18 11:10 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2003-02-17 23:38 lexicographic ordering is not always best Hans Reiser
2003-02-18 10:54 ` Nikita Danilov
2003-02-18 11:10 ` Hans Reiser
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.