git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: Junio C Hamano <gitster@pobox.com>
Cc: Jeff King <peff@peff.net>, Johan Herland <johan@herland.net>,
	git@vger.kernel.org
Subject: Re: [PATCH 09/17] gc_boundary(): move the check "alloc <= nr" to caller
Date: Thu, 23 May 2013 09:09:43 +0200	[thread overview]
Message-ID: <519DC0B7.7080401@alum.mit.edu> (raw)
In-Reply-To: <7vobc4nrz8.fsf@alter.siamese.dyndns.org>

On 05/21/2013 07:49 PM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
>> There is no logical reason for this test to be here.  At the caller we
>> might be able to figure out its meaning.
>>
>> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
>> ---
> 
> I do not think this change is justified, *unless* the caller later
> in the series gains a better heuristics than what can be done with
> information in the object_array (namely, alloc and nr) to decide
> when to trigger gc.
> 
> And I was hoping to see such a cleverness added to the caller, but I
> do not think I saw any.

Correct.

> I would have to say gc_boundary() knows better when it needs to gc
> with the code at this point in the series, and that is true also in
> the final code after all the patches in this series.
> 
> If we keep the "when to gc" logic inside "gc", in 11/17 this caller
> can no longer call directly to object_array_filter().  It should
> call gc_boundary(), but I see it as a merit, not a downside.  The
> "gc function can later be taught the high/low watermark logic you
> alluded to in 10/17, and the growth/shrinkage characteristic you
> would take advantage of while doing "gc" is specific to this
> codepath.  And the logic still does not have to have access to
> anything only the caller has access to; "gc" can work on what can be
> read from the object_array->{alloc,nr} that is given to it.

I don't feel strongly about this patch and if you prefer to have it
dropped I will do so.  But let me explain my reasoning:

1. The function name gc_boundary() suggests that it will do a garbage
collection unconditionally.  In fact, it might or might not depending on
this test.  At the caller, this made it look like a gc was happening
each time through the loop, which made me nervous about the performance.
 The new version makes it clear at the caller that the gc is only
happening occasionally.

2. Even assuming that gc_boundary() were renamed to maybe_gc_boundary(),
the function has hopelessly little information on which to base its
decision whether or not to gc, and the choice of policy can only be
justified based on some implicit knowledge about how the array is grown
and shrunk.  But the growing/shrinking is done at the layer of the
caller, and therefore the choice of gc policy also belongs at the layer
of the caller.

3. As you point out, if the gc policy is ever to be made more
intelligent, then gc_boundary() is unlikely to have enough information
to implement the new policy (e.g., it would have no place to record
low/high water marks).  Separating concerns at the correct level would
make a change like that easier.

At the moment I am not interested in improving the gc policy of this
code.  The only reason that I am mucking about here is to change it to
use object_array_filter(), which is needed to centralize where
object_array_entries are created and destroyed so that the "name" memory
can be copied and freed consistently.  That can be done with or without
patches 09 and 10.  Please let me know what you prefer.

Michael

>   revision.c | 27 ++++++++++++---------------
> 
>>  1 file changed, 12 insertions(+), 15 deletions(-)
>>
>> diff --git a/revision.c b/revision.c
>> index 8ac88d6..2e0992b 100644
>> --- a/revision.c
>> +++ b/revision.c
>> @@ -2437,23 +2437,19 @@ static struct commit *get_revision_1(struct rev_info *revs)
>>  
>>  static void gc_boundary(struct object_array *array)
>>  {
>> -	unsigned nr = array->nr;
>> -	unsigned alloc = array->alloc;
>> +	unsigned nr = array->nr, i, j;
>>  	struct object_array_entry *objects = array->objects;
>>  
>> -	if (alloc <= nr) {
>> -		unsigned i, j;
>> -		for (i = j = 0; i < nr; i++) {
>> -			if (objects[i].item->flags & SHOWN)
>> -				continue;
>> -			if (i != j)
>> -				objects[j] = objects[i];
>> -			j++;
>> -		}
>> -		for (i = j; i < nr; i++)
>> -			objects[i].item = NULL;
>> -		array->nr = j;
>> +	for (i = j = 0; i < nr; i++) {
>> +		if (objects[i].item->flags & SHOWN)
>> +			continue;
>> +		if (i != j)
>> +			objects[j] = objects[i];
>> +		j++;
>>  	}
>> +	for (i = j; i < nr; i++)
>> +		objects[i].item = NULL;
>> +	array->nr = j;
>>  }
>>  
>>  static void create_boundary_commit_list(struct rev_info *revs)
>> @@ -2577,7 +2573,8 @@ static struct commit *get_revision_internal(struct rev_info *revs)
>>  		if (p->flags & (CHILD_SHOWN | SHOWN))
>>  			continue;
>>  		p->flags |= CHILD_SHOWN;
>> -		gc_boundary(&revs->boundary_commits);
>> +		if (revs->boundary_commits.alloc <= revs->boundary_commits.nr)
>> +			gc_boundary(&revs->boundary_commits);
>>  		add_object_array(p, NULL, &revs->boundary_commits);
>>  	}

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

  reply	other threads:[~2013-05-23  7:09 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-05-19 20:26 [PATCH 00/17] Remove assumptions about refname lifetimes Michael Haggerty
2013-05-19 20:26 ` [PATCH 01/17] describe: make own copy of refname Michael Haggerty
2013-05-19 20:26 ` [PATCH 02/17] fetch: make own copies of refnames Michael Haggerty
2013-05-19 20:26 ` [PATCH 03/17] add_rev_cmdline(): make a copy of the name argument Michael Haggerty
2013-05-19 20:26 ` [PATCH 04/17] builtin_diff_tree(): make it obvious that function wants two entries Michael Haggerty
2013-05-21 17:27   ` Junio C Hamano
2013-05-23  7:19     ` Michael Haggerty
2013-05-19 20:27 ` [PATCH 05/17] cmd_diff(): use an object_array for holding trees Michael Haggerty
2013-05-21 17:30   ` Junio C Hamano
2013-05-23  7:21     ` Michael Haggerty
2013-05-19 20:27 ` [PATCH 06/17] cmd_diff(): rename local variable "list" -> "entry" Michael Haggerty
2013-05-19 20:27 ` [PATCH 07/17] cmd_diff(): make it obvious which cases are exclusive of each other Michael Haggerty
2013-05-19 20:27 ` [PATCH 08/17] revision: split some overly-long lines Michael Haggerty
2013-05-21 17:34   ` Junio C Hamano
2013-05-23  6:27     ` Michael Haggerty
2013-05-23 17:08       ` Junio C Hamano
2013-05-19 20:27 ` [PATCH 09/17] gc_boundary(): move the check "alloc <= nr" to caller Michael Haggerty
2013-05-21 17:49   ` Junio C Hamano
2013-05-23  7:09     ` Michael Haggerty [this message]
2013-05-23 18:02       ` Junio C Hamano
2013-05-19 20:27 ` [PATCH 10/17] get_revision_internal(): make check less mysterious Michael Haggerty
2013-05-21 17:38   ` Junio C Hamano
2013-05-23  6:39     ` Michael Haggerty
2013-05-19 20:27 ` [PATCH 11/17] object_array: add function object_array_filter() Michael Haggerty
2013-05-19 20:27 ` [PATCH 12/17] object_array_remove_duplicates(): rewrite to reduce copying Michael Haggerty
2013-05-19 20:27 ` [PATCH 13/17] fsck: don't put a void*-shaped peg in a char*-shaped hole Michael Haggerty
2013-05-19 20:27 ` [PATCH 14/17] find_first_merges(): initialize merges variable using initializer Michael Haggerty
2013-05-19 20:27 ` [PATCH 15/17] find_first_merges(): remove unnecessary code Michael Haggerty
2013-05-19 20:27 ` [RFC 16/17] object_array_entry: copy name before storing in name field Michael Haggerty
2013-05-20 10:33   ` Johan Herland
2013-05-20 14:42     ` Michael Haggerty
2013-05-20 16:44       ` Jeff King
2013-05-20 21:34         ` Michael Haggerty
2013-05-19 20:27 ` [RFC 17/17] refs: document the lifetime of the refname passed to each_ref_fn Michael Haggerty
2013-05-20 10:28 ` [PATCH 00/17] Remove assumptions about refname lifetimes Johan Herland
2013-05-20 12:15   ` Michael Haggerty
2013-05-20 16:37   ` Junio C Hamano
2013-05-20 16:59     ` Jeff King
2013-05-20 17:08       ` Johan Herland
2013-05-20 18:03       ` Junio C Hamano
2013-05-20 17:03     ` Johan Herland
2013-05-21 18:39       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=519DC0B7.7080401@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=johan@herland.net \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).