git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [rfh] do I need to use something more complex to do this?
@ 2013-04-10 14:40 Junio C Hamano
  2013-04-10 15:23 ` Andreas Ericsson
  2013-04-10 18:19 ` Junio C Hamano
  0 siblings, 2 replies; 3+ messages in thread
From: Junio C Hamano @ 2013-04-10 14:40 UTC (permalink / raw)
  To: git

I have set of items with two attributes, <X,Y>, and would like to
keep them in some data structure in such a way that it is efficient
to (1) add a new item to the data structure, and (2) pick an item in
a specific order. There can be multiple items that share the same
value for X, or Y, or both X and Y, and it does not matter in what
order items comes out among those that share the same <X,Y>.

The type of X is totally ordered. The type of Y also usually is, but
Y can take a special value U(nspecified).

Now on to the "specific" order I want to pick an item.  I'd like to
take the item with the largest value of Y in general, and tiebreaking
on the value of X which also I prefer to take from larger to smaller.

But with a twist.

When I am picking an item <X=n,Y=m>, there should be no item
remaining in the data store with a value of Y that is smaller than m
(duplicates are allowed, so there can still be items with Y=m), and
also when I am picking <X=n,Y=m>, there should be no item with
Y=Unspecified that has a value of X that is equal or smaller than n.

E.g. if I have these 6 items (ignore the lines between the items for
now):

            <104,U>--<105,U>--<106,4>
           /
    <101,U>--<100,U>--<102,3>--<104,4>

I would want to pick them up in this order:

    <106,4> <105,U> <104,U> <104,4> <102,3> <101,U> <100,U>

I see how this can easily be done by using a two priority lists,
i.e. one for items with Y=Unspecified that is sorted by X, and the
other for all other items that is sorted by <Y,X>. Peek the top of
both, and pick the top of the former until its X is smaller than the
value of X of the top of the latter, otherwise pick the top of the
latter.  I am wondering if I can use less complex data structure,
like a single ordered sorted array, with a clever comparison
function.

For the curious, the items in the above picture represents commits,
and lines are ancestry chains between them. I am thinking how we can
extend the still_interesting() function with an optional generation
number.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [rfh] do I need to use something more complex to do this?
  2013-04-10 14:40 [rfh] do I need to use something more complex to do this? Junio C Hamano
@ 2013-04-10 15:23 ` Andreas Ericsson
  2013-04-10 18:19 ` Junio C Hamano
  1 sibling, 0 replies; 3+ messages in thread
From: Andreas Ericsson @ 2013-04-10 15:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On 04/10/2013 04:40 PM, Junio C Hamano wrote:
> I have set of items with two attributes, <X,Y>, and would like to
> keep them in some data structure in such a way that it is efficient
> to (1) add a new item to the data structure, and (2) pick an item in
> a specific order. There can be multiple items that share the same
> value for X, or Y, or both X and Y, and it does not matter in what
> order items comes out among those that share the same <X,Y>.
>
> The type of X is totally ordered. The type of Y also usually is, but
> Y can take a special value U(nspecified).
>
> Now on to the "specific" order I want to pick an item.  I'd like to
> take the item with the largest value of Y in general, and tiebreaking
> on the value of X which also I prefer to take from larger to smaller.
>
> But with a twist.
>
> When I am picking an item <X=n,Y=m>, there should be no item
> remaining in the data store with a value of Y that is smaller than m
> (duplicates are allowed, so there can still be items with Y=m), and
> also when I am picking <X=n,Y=m>, there should be no item with
> Y=Unspecified that has a value of X that is equal or smaller than n.
>

So X is primary sort and Y is secondary, except Y=Undefined trumps all
other values for Y, but never trumps X as primary sort.

Can't you just have U be the largest unsigned integer value of the
type you choose? For this particular application, I doubt there's any
risk of the defined numbers catching up with it.

I might have missed something though. This seems a bit too trivial for
you to ask for help.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [rfh] do I need to use something more complex to do this?
  2013-04-10 14:40 [rfh] do I need to use something more complex to do this? Junio C Hamano
  2013-04-10 15:23 ` Andreas Ericsson
@ 2013-04-10 18:19 ` Junio C Hamano
  1 sibling, 0 replies; 3+ messages in thread
From: Junio C Hamano @ 2013-04-10 18:19 UTC (permalink / raw)
  To: git

Junio C Hamano <gitster@pobox.com> writes:

> I have set of items with two attributes, <X,Y>, and would like to
> keep them in some data structure in such a way that it is efficient
> to (1) add a new item to the data structure, and (2) pick an item in
> a specific order. There can be multiple items that share the same
> value for X, or Y, or both X and Y, and it does not matter in what
> order items comes out among those that share the same <X,Y>.
>
> The type of X is totally ordered. The type of Y also usually is, but
> Y can take a special value U(nspecified).
>
> Now on to the "specific" order I want to pick an item.  I'd like to
> take the item with the largest value of Y in general, and tiebreaking
> on the value of X which also I prefer to take from larger to smaller.
>
> But with a twist.
>
> When I am picking an item <X=n,Y=m>, there should be no item
> remaining in the data store with a value of Y that is smaller than m
> (duplicates are allowed, so there can still be items with Y=m), and
> also when I am picking <X=n,Y=m>, there should be no item with
> Y=Unspecified that has a value of X that is equal or smaller than n.
>
> E.g. if I have these 6 items (ignore the lines between the items for
> now):
>
>             <104,U>--<105,U>--<106,4>
>            /
>     <101,U>--<100,U>--<102,3>--<104,4>
>
> I would want to pick them up in this order:
>
>     <106,4> <105,U> <104,U> <104,4> <102,3> <101,U> <100,U>

Note that with the above specification, a possible solution is to
show all the items with Y=Unspecified before showing others, but
that would not be ideal for the intended use case; pretending Y=U as
if Y=max_range is not a usable workaround.

This is "I create a stream of items with specified Y in descending
order.  There are some items with Y=Unspecified and I want to find
appropriate places to mix the latter into that stream".

Because the desired ordering is not a total order, I need to go
to the "pair of priority list" route, I think.
>
> I see how this can easily be done by using a two priority lists,
> i.e. one for items with Y=Unspecified that is sorted by X, and the
> other for all other items that is sorted by <Y,X>. Peek the top of
> both, and pick the top of the former until its X is smaller than the
> value of X of the top of the latter, otherwise pick the top of the
> latter.  I am wondering if I can use less complex data structure,
> like a single ordered sorted array, with a clever comparison
> function.
>
> For the curious, the items in the above picture represents commits,
> and lines are ancestry chains between them. I am thinking how we can
> extend the still_interesting() function with an optional generation
> number.

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2013-04-10 18:21 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-04-10 14:40 [rfh] do I need to use something more complex to do this? Junio C Hamano
2013-04-10 15:23 ` Andreas Ericsson
2013-04-10 18:19 ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).