All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: Jeff King <peff@peff.net>
Cc: Johan Herland <johan@herland.net>,
	Junio C Hamano <gitster@pobox.com>,
	git@vger.kernel.org
Subject: Re: [RFC 16/17] object_array_entry: copy name before storing in name field
Date: Mon, 20 May 2013 23:34:13 +0200	[thread overview]
Message-ID: <519A96D5.2070608@alum.mit.edu> (raw)
In-Reply-To: <20130520164458.GA27788@sigill.intra.peff.net>

On 05/20/2013 06:44 PM, Jeff King wrote:
> On Mon, May 20, 2013 at 04:42:38PM +0200, Michael Haggerty wrote:
> 
>>>> * Many callers store the empty string ("") as the name; for example,
>>>>   most of the entries created during a run of rev-list have "" as
>>>>   their name.  This means that lots of needless copies of "" are being
>>>>   made.  I think that the best solution to this problem would be to
>>>>   store NULL rather than "" for such entries, but I haven't figured
>>>>   out all of the places where the name is used.
>>>
>>> Use strbufs?
>>>
>>> No allocation (except for the strbuf object itself) is needed for
>>> empty strings, and string ownership and be transferred to and from it
>>> to prevent extra copies.
>>
>> That would cost two extra size_t per object_array_entry.  I have the
>> feeling that this structure is used often enough that the extra overhead
>> would be a disadvantage, but I'm not sure.
>>
>> The obvious alternative would be to teach users to deal with NULL and
>> either add another constructor alternative that transfers string
>> ownership or *always* transfer string ownership and change the callers
>> to call xstrdup() if they don't already own the name string.  I think I
>> will try that approach first.
> 
> You could use the same trick that strbuf does: instead of NULL, point to
> a well-known empty string literal. Readers do not have to care about
> this optimization at all; only writers need to recognize the well-known
> pointer value. And since we do not update in place but only eventually
> free, it really is just that anyone calling free() would do "if (name !=
> well_known_empty_string)".

Yes, that sounds like the best solution.  Ultimately there is only one
writer, add_object_array_with_mode(), and it can do

    if (!name)
        entry->name = NULL;
    else if (!*name)
        entry->name = well_known_empty_string;
    else
        entry->name = xstrdup(name);

This should be a lot less intrusive than what I was trying: to change
callers who wrote name="" to write name=NULL instead.  But it is a
nightmare to find all of the code that reads name and decide whether
they need to do

    entry->name ? entry->name : ""

because that in turn depends on whether the code that wrote into the
same object_array always/never/sometimes wrote strings vs. NULL to the
name field.  Blech, encapsulation is tough in C.

While I was chasing down callers, I came across other gems like

builtin/checkout.c:
	add_pending_object(&revs, object, sha1_to_hex(object->sha1));

revision.c:
	add_pending_object(revs, object, sha1_to_hex(object->sha1));

submodule.c:
	add_pending_object(rev, &list->item->object,
		sha1_to_hex(list->item->object.sha1));

so apparently I wasn't the only one befuddled by the lifetime and
ownership of the name field of object_array_entry.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

  reply	other threads:[~2013-05-20 21:34 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-05-19 20:26 [PATCH 00/17] Remove assumptions about refname lifetimes Michael Haggerty
2013-05-19 20:26 ` [PATCH 01/17] describe: make own copy of refname Michael Haggerty
2013-05-19 20:26 ` [PATCH 02/17] fetch: make own copies of refnames Michael Haggerty
2013-05-19 20:26 ` [PATCH 03/17] add_rev_cmdline(): make a copy of the name argument Michael Haggerty
2013-05-19 20:26 ` [PATCH 04/17] builtin_diff_tree(): make it obvious that function wants two entries Michael Haggerty
2013-05-21 17:27   ` Junio C Hamano
2013-05-23  7:19     ` Michael Haggerty
2013-05-19 20:27 ` [PATCH 05/17] cmd_diff(): use an object_array for holding trees Michael Haggerty
2013-05-21 17:30   ` Junio C Hamano
2013-05-23  7:21     ` Michael Haggerty
2013-05-19 20:27 ` [PATCH 06/17] cmd_diff(): rename local variable "list" -> "entry" Michael Haggerty
2013-05-19 20:27 ` [PATCH 07/17] cmd_diff(): make it obvious which cases are exclusive of each other Michael Haggerty
2013-05-19 20:27 ` [PATCH 08/17] revision: split some overly-long lines Michael Haggerty
2013-05-21 17:34   ` Junio C Hamano
2013-05-23  6:27     ` Michael Haggerty
2013-05-23 17:08       ` Junio C Hamano
2013-05-19 20:27 ` [PATCH 09/17] gc_boundary(): move the check "alloc <= nr" to caller Michael Haggerty
2013-05-21 17:49   ` Junio C Hamano
2013-05-23  7:09     ` Michael Haggerty
2013-05-23 18:02       ` Junio C Hamano
2013-05-19 20:27 ` [PATCH 10/17] get_revision_internal(): make check less mysterious Michael Haggerty
2013-05-21 17:38   ` Junio C Hamano
2013-05-23  6:39     ` Michael Haggerty
2013-05-19 20:27 ` [PATCH 11/17] object_array: add function object_array_filter() Michael Haggerty
2013-05-19 20:27 ` [PATCH 12/17] object_array_remove_duplicates(): rewrite to reduce copying Michael Haggerty
2013-05-19 20:27 ` [PATCH 13/17] fsck: don't put a void*-shaped peg in a char*-shaped hole Michael Haggerty
2013-05-19 20:27 ` [PATCH 14/17] find_first_merges(): initialize merges variable using initializer Michael Haggerty
2013-05-19 20:27 ` [PATCH 15/17] find_first_merges(): remove unnecessary code Michael Haggerty
2013-05-19 20:27 ` [RFC 16/17] object_array_entry: copy name before storing in name field Michael Haggerty
2013-05-20 10:33   ` Johan Herland
2013-05-20 14:42     ` Michael Haggerty
2013-05-20 16:44       ` Jeff King
2013-05-20 21:34         ` Michael Haggerty [this message]
2013-05-19 20:27 ` [RFC 17/17] refs: document the lifetime of the refname passed to each_ref_fn Michael Haggerty
2013-05-20 10:28 ` [PATCH 00/17] Remove assumptions about refname lifetimes Johan Herland
2013-05-20 12:15   ` Michael Haggerty
2013-05-20 16:37   ` Junio C Hamano
2013-05-20 16:59     ` Jeff King
2013-05-20 17:08       ` Johan Herland
2013-05-20 18:03       ` Junio C Hamano
2013-05-20 17:03     ` Johan Herland
2013-05-21 18:39       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=519A96D5.2070608@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=johan@herland.net \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.