* prio_tree generalization
@ 2004-07-05 1:24 Werner Almesberger
2004-07-05 2:07 ` Andrea Arcangeli
2004-07-05 4:46 ` Andrew Morton
0 siblings, 2 replies; 5+ messages in thread
From: Werner Almesberger @ 2004-07-05 1:24 UTC (permalink / raw)
To: Rajesh Venkatasubramanian; +Cc: linux-kernel
Hi Rajesh,
I'm currently experimenting with the prio_tree code in an elevator
("IO scheduler"), and I'm thinking about a way to avoid code
duplication.
The most straightforward approach seems to be to put everything
after prio_tree_init and before vma_prio_tree_add into a new file,
and #include that file. (And prio_tree_init should be shared.)
#including a .c file normally isn't exactly considered an epitome
of elegance, but in this case, there doesn't seem to be much of a
choice.
There's another issue: in the elevator, entries overlap only
rarely if at all, and it is sometimes useful to walk the tree in
sort order. As far as I can tell, RPSTs can be walked just like
RB trees if there are no overlaps on the path from the current to
the respective adjacent node.
Unfortunately, "prio_tree_next" is already taken. It would be nice
to follow the same naming scheme as RB trees, so perhaps
prio_tree_next could become prio_tree_more, or such ?
What do you think ?
- Werner
--
_________________________________________________________________________
/ Werner Almesberger, Buenos Aires, Argentina wa@almesberger.net /
/_http://www.almesberger.net/____________________________________________/
* Re: prio_tree generalization
From: Andrea Arcangeli @ 2004-07-05 2:07 UTC (permalink / raw)
To: Werner Almesberger; +Cc: Rajesh Venkatasubramanian, linux-kernel
On Sun, Jul 04, 2004 at 10:24:38PM -0300, Werner Almesberger wrote:
> Hi Rajesh,
>
> I'm currently experimenting with the prio_tree code in an elevator
> ("IO scheduler"), and I'm thinking about a way to avoid code
> duplication.
that's a nice effort, I agree prio_tree.c is better suited for lib/ than
mm/ but the code already looks quite generic and well written.
>
> The most straightforward approach seems to be to put everything
> after prio_tree_init and before vma_prio_tree_add into a new file,
> and #include that file. (And prio_tree_init should be shared.)
>
> #including a .c file normally isn't exactly considered an epitome
> of elegance, but in this case, there doesn't seem to be much of a
> choice.
why don't you move the shared code to lib/prio_tree.c instead of
duplicating it in every object?
prio_tree_insert/prio_tree_remove/prio_tree_replace need to be
exported, etc.
> There's another issue: in the elevator, entries overlap only
> rarely if at all, and it is sometimes useful to walk the tree in
> sort order. As far as I can tell, RPSTs can be walked just like
> RB trees if there are no overlaps on the path from the current to
> the respective adjacent node.
>
> Unfortunately, "prio_tree_next" is already taken. It would be nice
> to follow the same naming scheme as RB trees, so perhaps
> prio_tree_next could become prio_tree_more, or such ?
I thought prio_tree_next was already the equivalent of rb_next for
prio-trees. The API is slightly different because you need an iterator
object, but I'm not sure how you want to change it to make it more
symmetric with rb_next.
* Re: prio_tree generalization
From: Werner Almesberger @ 2004-07-05 2:36 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: Rajesh Venkatasubramanian, linux-kernel
Andrea Arcangeli wrote:
> that's a nice effort, I agree prio_tree.c is better suited for lib/ than
> mm/ but the code already looks quite generic and well written.
The code is great, no problem there. But at some places, it needs
a macro that extracts the indices from each node. Callbacks are
likely to be too expensive, and e.g. with VMAs, the indices are
actually calculated, so just passing offsets wouldn't work
either.
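As a rough illustration of the point about macros versus callbacks, here
is a minimal sketch of shared code that relies on a user-supplied
GET_INDEX macro. The (node, radix, heap) argument form is an assumption
modelled on mm/prio_tree.c; struct demo_request, demo_entry and
demo_overlaps are hypothetical names, and the struct prio_tree_node below
is only a stand-in for the kernel's.

```c
#include <stddef.h>

/* Stand-in for the kernel's node type (assumption, not the real header). */
struct prio_tree_node {
	struct prio_tree_node *left, *right, *parent;
};

/* Hypothetical user type: a request covering [sector, sector + nr_sectors - 1]. */
struct demo_request {
	unsigned long sector;
	unsigned long nr_sectors;
	struct prio_tree_node node;
};

/* Map an embedded node back to its containing demo_request. */
#define demo_entry(ptr) \
	((struct demo_request *)((char *)(ptr) - offsetof(struct demo_request, node)))

/*
 * Supplied by the user before #including the shared code. The heap index
 * is computed rather than stored, so passing a fixed field offset to the
 * shared code would not be enough.
 */
#define GET_INDEX(n, radix, heap)				\
do {								\
	struct demo_request *rq_ = demo_entry(n);		\
	(radix) = rq_->sector;					\
	(heap) = rq_->sector + rq_->nr_sectors - 1;		\
} while (0)

/* Example of what the #included file could contain: no callback involved. */
static int demo_overlaps(struct prio_tree_node *n,
			 unsigned long start, unsigned long end)
{
	unsigned long r, h;

	GET_INDEX(n, r, h);
	return r <= end && start <= h;
}
```

Because GET_INDEX expands inline, the index extraction costs no function
call; the price is that each user of the shared code compiles its own copy.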
> why don't you move the shared code to lib/prio_tree.c instead of
> duplicating it in every object?
Yes, there are some more functions that should be shareable,
i.e. prio_tree_replace, prio_tree_parent, and prio_tree_next.
Also prio_tree_expand might be a candidate.
But this still leaves a few that depend on GET_INDEX.
> I thought prio_tree_next was already the equivalent of rb_next for
> prio-trees.
Yeah, it kind of is, but I'm looking for something more
light-weight, that just gives me an adjacent node. Also, I
want to be able to go back. Here's my prio_tree_succ (minus
the comments); the backward walk is its mirror image. It
should look familiar to you :-)
struct prio_tree_node *prio_tree_succ(struct prio_tree_node *node)
{
	if (!prio_tree_right_empty(node)) {
		node = node->right;
		while (!prio_tree_left_empty(node))
			node = node->left;
		return node;
	}
	while (!prio_tree_no_parent(node) && node == node->parent->right)
		node = node->parent;
	return prio_tree_no_parent(node) ? NULL : node->parent;
}
Of course, this kind of iteration only makes sense if your tree
isn't just a bag of random ranges.
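The backward direction mentioned above is not shown in the message. A
minimal sketch of what such a predecessor walk might look like follows,
with left and right swapped relative to the successor walk. The name
prio_tree_prev comes from the text; the struct and the three helper
predicates are stand-ins, assuming that an empty child link and the
root's parent link point back at the node itself.

```c
#include <stddef.h>

/* Stand-in for the kernel's node type (assumption, not the real header). */
struct prio_tree_node {
	struct prio_tree_node *left, *right, *parent;
};

/* Assumed convention: a missing link points back at the node itself. */
static int prio_tree_left_empty(const struct prio_tree_node *n)
{
	return n->left == n;
}

static int prio_tree_right_empty(const struct prio_tree_node *n)
{
	return n->right == n;
}

static int prio_tree_no_parent(const struct prio_tree_node *n)
{
	return n->parent == n;
}

/* Hypothetical in-order predecessor; returns NULL at the leftmost node. */
struct prio_tree_node *prio_tree_prev(struct prio_tree_node *node)
{
	if (!prio_tree_left_empty(node)) {
		/* Rightmost node of the left subtree. */
		node = node->left;
		while (!prio_tree_right_empty(node))
			node = node->right;
		return node;
	}
	/* Climb while we are a left child; the next parent is the answer. */
	while (!prio_tree_no_parent(node) && node == node->parent->left)
		node = node->parent;
	return prio_tree_no_parent(node) ? NULL : node->parent;
}
```

As with the forward walk, the result is only meaningful as "previous in
sort order" when the ranges along the path do not overlap.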
- Werner
--
_________________________________________________________________________
/ Werner Almesberger, Buenos Aires, Argentina wa@almesberger.net /
/_http://www.almesberger.net/____________________________________________/
* Re: prio_tree generalization
From: Andrew Morton @ 2004-07-05 4:46 UTC (permalink / raw)
To: Werner Almesberger; +Cc: vrajesh, linux-kernel
Werner Almesberger <wa@almesberger.net> wrote:
>
> I'm currently experimenting with the prio_tree code in an elevator
> ("IO scheduler"),
Offtopic, but that's a premature optmztn. O(n) linear searches work just
fine for disk elevators under most circumstances - we don't get many
complaints about CPU consumption in the 2.4 elevator.
A disk isn't going to retire more than 100 requests/sec in practice, and
the cost of an all-requests search is relatively small.
Once the new design is settled in, is proven to be useful and desirable,
that's the time to start thinking about millioptimisations such as
converting the search complexity from O(n) to O(log(n)).
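For reference, the kind of O(n) search under discussion can be sketched
as a sorted insert into a singly linked list. This is only an
illustration: struct demo_rq and demo_insert_sorted are hypothetical
names and do not correspond to the 2.4 elevator's actual data structures.

```c
#include <stddef.h>

/* Hypothetical request record; not the kernel's struct request. */
struct demo_rq {
	unsigned long sector;
	struct demo_rq *next;
};

/*
 * Insert rq into a list kept sorted by ascending sector. The walk visits
 * up to n queued requests, which is the O(n) cost being weighed against
 * an O(log(n)) tree lookup.
 */
void demo_insert_sorted(struct demo_rq **head, struct demo_rq *rq)
{
	struct demo_rq **pos = head;

	while (*pos && (*pos)->sector <= rq->sector)
		pos = &(*pos)->next;
	rq->next = *pos;
	*pos = rq;
}
```

The cost of each submission grows with the queue length, so the queue
depth in use determines how much this matters.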
* Re: prio_tree generalization
From: Werner Almesberger @ 2004-07-05 11:05 UTC (permalink / raw)
To: Andrew Morton; +Cc: vrajesh, linux-kernel
Andrew Morton wrote:
> Offtopic, but that's a premature optmztn.
Hmm, maybe. But it's actually not so much more work to use a
tree than it is to use a linear list. Particularly prio_tree is
very light on its users, because - unlike with RB trees - you
don't have to code all the lookups.
It may actually be nice to have something like a set of
includable functions using some GET_INDEX macro also for RB
trees, to make the easy cases even easier to write.
> A disk isn't going to retire more than 100 requests/sec in practice, and
> the cost of an all-requests search is relatively small.
On admittedly not very impressive hardware (a 1.2 GHz Duron), I
see about 60-100 us processing time per request submission in a
random access test using kernel AIO, with elevators using mainly
RB trees, with nr_requests = 8192.
The 60 us are for my experimental elevator, the 100 us for
anticipatory, so most of that time really is spent in the
elevator.
So, a few ms per request don't seem too unlikely for a linear
search, combined with a larger-than-default queue. (It seems
that most people trying to optimize elevator performance are
using something in the nr_requests = 1000 range.)
- Werner
--
_________________________________________________________________________
/ Werner Almesberger, Buenos Aires, Argentina wa@almesberger.net /
/_http://www.almesberger.net/____________________________________________/