git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: Eric Sunshine <sunshine@sunshineco.com>
Cc: "Junio C Hamano" <gitster@pobox.com>,
	"David Turner" <dturner@twopensource.com>,
	"Jeff King" <peff@peff.net>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>,
	"Git List" <git@vger.kernel.org>
Subject: Re: [PATCH 09/13] refs: introduce an iterator interface
Date: Tue, 31 May 2016 09:59:04 +0200	[thread overview]
Message-ID: <574D4448.5020004@alum.mit.edu> (raw)
In-Reply-To: <CAPig+cSjzZGUjdgkz1y7brGNb1M2gHfW0UG-wgBc00beNDQmnA@mail.gmail.com>

On 05/31/2016 07:29 AM, Eric Sunshine wrote:
> On Mon, May 30, 2016 at 3:55 AM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
>> [...]
> [...]
> Either:
> 
>     s/false/something other than ITER_OK/
> 
> or:
> 
>     s/false/ITER_DONE or ITER_ERROR/

Thanks.

>> +int ref_iterator_advance(struct ref_iterator *ref_iterator);
>> +
>> +/*
>> + * An iterator over nothing (its first ref_iterator_advance() call
>> + * returns 0).
>> + */
> 
> s/0/ITER_DONE/

Thanks. I guess you can guess what an earlier draft of this interface
looked like :-)

>> +struct ref_iterator *empty_ref_iterator_begin(void);
>> +
>> +/*
>> + * Return true iff ref_iterator is an empty_ref_iterator.
>> + */
>> +int is_empty_ref_iterator(struct ref_iterator *ref_iterator);
> 
> I can see that you used this function as an optimization or
> convenience in overlay_ref_iterator_begin(), but do you expect it to
> be generally useful otherwise? Is it worth publishing? Do you have
> other use-cases in mind?

It is only "published" within the refs module, in refs/refs-internal.h.
This header file is not meant to be used by code outside of the refs module.

My thinking was that it might be useful to other reference backends. The
function is pretty safe for anybody to call, though I admit that it is
not very general.

I don't have a strong feeling either way. If nobody else chimes in, I'll
remove it from the header file as you suggested. We can always add it
back if somebody needs it.

> Also, can you explain why the merge iterator doesn't also perform the
> optimization/convenience of checking if one iterator is an empty
> iterator?

That's because the merge iterator doesn't know what its select function
will do. For example, you could imagine an "intersect" select function
that only lets through references that were in *both* sub-iterators. In
that case, your suggested "optimization" would be incorrect.

Incidentally, that's also why I decided to leave the select function in
charge even after one or both of the sub-iterators is exhausted—because
it lets merge_ref_iterator implement more diverse behavior.

>> +/*
>> + * Iterate over the intries from iter0 and iter1, with the values
> 
> s/intries/entries/

Thanks.

>> + * interleaved as directed by the select function. The iterator takes
>> + * ownership of iter0 and iter1 and frees them when the iteration is
>> + * over.
>> + */
>> +struct ref_iterator *merge_ref_iterator_begin(
>> +               struct ref_iterator *iter0, struct ref_iterator *iter1,
>> +               ref_iterator_select_fn *select, void *cb_data);
>> +
>> +/*
>> + * An iterator consisting of the union of the entries from iter0 and
>> + * iter1. If there are entries common to the two sub-iterators, use
>> + * the one from iter1. Each iterator must iterate over its entries in
>> + * strcmp() order by refname for this to work.
>> + *
>> + * The new iterator takes ownership of its arguments and frees them
>> + * when the iteration is over. As a convenience to callers, if iter0
>> + * or iter1 is_empty_ref_iterator(), then abort that one immediately
>> + * and return the other iterator directly, without wrapping it.
>> + */
>> +struct ref_iterator *overlay_ref_iterator_begin(struct ref_iterator *iter0,
>> +                                               struct ref_iterator *iter1);
> 
> When reading about the overlay iterator (both code and documentation),
> my expectation was that iter0 would shadow iter1, not the other way
> around as implemented here. Of course, that's entirely subjective, but
> the generic names don't provide any useful clues as to which shadows
> which. Perhaps giving them more meaningful names would help.

That's a good idea. I also found myself having to refer back to the
documentation to remind myself which was which.

How about I rename them "back" and "front"? I will also reverse the
order of the arguments.

(But I will leave the names "iter0" and "iter1" in merge_ref_iterator,
and also the constants like ITER_SELECT_0, because these don't
necessarily have the interpretation of "back" and "front".)

>> +/*
>> + * Wrap iter0, only letting through the references whose names start
>> + * with prefix. If trim is set, set iter->refname to the name of the
>> + * reference with that many characters trimmed off the front;
>> + * otherwise set it to the full refname. The new iterator takes over
>> + * ownership of iter0 and frees it when iteration is over. It makes
>> + * its own copy of prefix.
>> + *
>> + * As an convenience to callers, if prefix is the empty string and
>> + * trim is zero, this function returns iter0 directly, without
>> + * wrapping it.
>> + */
>> +struct ref_iterator *prefix_ref_iterator_begin(struct ref_iterator *iter0,
>> +                                              const char *prefix,
>> +                                              int trim);
> 
> Minor: Similarly, when reading the code and documentation, I wondered
> why this was named 'iter0' when no 'iter1' was in sight. Perhaps name
> it simply 'iter'.

I found that it got a little bit confusing, because the constructor and
method implementations all use `iter` as a local variable. In particular
in the constructor there would want to be an argument "iter" and also
the local variable "iter" for the iterator being constructed, so a new
name would otherwise have to be invented for one or the other. Between
all the "iter" and "iter" and "iter->iter", I found that naming the
sub-iterator "iter0" made things a little bit less bewildering.

If you don't like that, we could name the embedded iterators something
like "subiter", "subiter0", and "subiter1". But the current convention
is a bit more succinct so I slightly prefer it.

Thanks for all your comments!

Michael

  reply	other threads:[~2016-05-31  7:59 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-30  7:55 [PATCH 00/13] Reference iterators Michael Haggerty
2016-05-30  7:55 ` [PATCH 01/13] refs: remove unnecessary "extern" keywords Michael Haggerty
2016-05-30  7:55 ` [PATCH 02/13] do_for_each_ref(): move docstring to the header file Michael Haggerty
2016-05-30  7:55 ` [PATCH 03/13] refs: use name "prefix" consistently Michael Haggerty
2016-05-30  7:55 ` [PATCH 04/13] delete_refs(): add a flags argument Michael Haggerty
2016-05-30  7:55 ` [PATCH 05/13] remote rm: handle symbolic refs correctly Michael Haggerty
2016-05-30  7:55 ` [PATCH 06/13] get_ref_cache(): only create an instance if there is a submodule Michael Haggerty
2016-05-30  7:55 ` [PATCH 07/13] entry_resolves_to_object(): rename function from ref_resolves_to_object() Michael Haggerty
2016-05-30  7:55 ` [PATCH 08/13] ref_resolves_to_object(): new function Michael Haggerty
2016-05-30  7:55 ` [PATCH 09/13] refs: introduce an iterator interface Michael Haggerty
2016-05-30 15:22   ` Ramsay Jones
2016-05-30 16:57     ` Ramsay Jones
2016-05-31  1:16       ` Michael Haggerty
2016-05-31  5:29   ` Eric Sunshine
2016-05-31  7:59     ` Michael Haggerty [this message]
2016-05-31 23:12       ` Eric Sunshine
2016-06-03 11:48         ` Michael Haggerty
2016-05-31  6:10   ` Junio C Hamano
2016-05-31  8:45     ` Michael Haggerty
2016-06-02 10:08   ` Duy Nguyen
2016-06-02 16:23     ` Michael Haggerty
2016-05-30  7:55 ` [PATCH 10/13] do_for_each_ref(): reimplement using reference iteration Michael Haggerty
2016-05-30  7:55 ` [PATCH 11/13] for_each_reflog(): don't abort for bad references Michael Haggerty
2016-05-30  7:55 ` [PATCH 12/13] dir_iterator: new API for iterating over a directory tree Michael Haggerty
2016-06-01  0:12   ` David Turner
2016-06-03 11:57     ` Michael Haggerty
2016-05-30  7:55 ` [PATCH 13/13] for_each_reflog(): reimplement using iterators Michael Haggerty

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=574D4448.5020004@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=dturner@twopensource.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=pclouds@gmail.com \
    --cc=peff@peff.net \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).