git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: William Duclot <william.duclot@ensimag.grenoble-inp.fr>,
	git@vger.kernel.org, simon.rabourg@ensimag.grenoble-inp.fr,
	francois.beutin@ensimag.grenoble-inp.fr,
	antoine.queru@ensimag.grenoble-inp.fr,
	matthieu.moy@grenoble-inp.fr
Subject: Re: [PATCH 2/2] strbuf: allow to use preallocated memory
Date: Tue, 31 May 2016 10:25:29 +0200	[thread overview]
Message-ID: <574D4A79.1030405@alum.mit.edu> (raw)
In-Reply-To: <alpine.DEB.2.20.1605310822490.4449@virtualbox>

On 05/31/2016 08:41 AM, Johannes Schindelin wrote:
> Hi Michael,
> 
> On Tue, 31 May 2016, Michael Haggerty wrote:
> 
>> On 05/30/2016 02:13 PM, Johannes Schindelin wrote:
>>> [...]
>>>> @@ -38,7 +67,11 @@ char *strbuf_detach(struct strbuf *sb, size_t *sz)
>>>>  {
>>>>  	char *res;
>>>>  	strbuf_grow(sb, 0);
>>>> -	res = sb->buf;
>>>> +	if (sb->flags & STRBUF_OWNS_MEMORY)
>>>> +		res = sb->buf;
>>>> +	else
>>>> +		res = xmemdupz(sb->buf, sb->alloc - 1);
>>>
>>> This looks like a usage to be avoided: if we plan to detach the buffer,
>>> anyway, there is no good reason to allocate it on the heap first. I would
>>> at least issue a warning here.
>>
>> I think this last case should be changed to
>>
>>     res = xmemdupz(sb->buf, sb->len);
>>
>> Johannes, if this change is made then I think that there is a reasonable
>> use case for calling `strbuf_detach()` on a strbuf that wraps a
>> stack-allocated string, so I don't think that a warning is needed.
>>
>> I think this change makes sense. After all, once a caller detaches a
>> string, it is probably not planning on growing/shrinking it anymore, so
>> any more space than that would probably be wasted. In fact, since the
>> caller has no way to ask how much extra space the detached string has
>> allocated, it is almost guaranteed that the space would be wasted.
> 
> Ah, I did not think of that. (We could of course introduce
> strbuf_realloc_snugly() or some such, but I agree: why?) So the best
> counter-example to my objection might be something like this:
> 
> -- snip --
> Subject: git_pathdup(): avoid realloc()
> 
> When generating Git paths, we already know the maximum length:
> MAX_PATH. Let's avoid realloc()-caused fragmentation by using a
> fixed buffer.
> 
> As a side effect, we avoid overallocating the end result, too:
> previously, strbuf_detach() would not realloc() to avoid wasting
> the (sb->alloc - sb->len) bytes, while it malloc()s the precisely
> needed amount of memory when detaching fixed buffers.
> 
> diff --git a/path.c b/path.c
> index 2511d8a..64fd3ee 100644
> --- a/path.c
> +++ b/path.c
> @@ -426,7 +426,8 @@ const char *git_path(const char *fmt, ...)
>  
>  char *git_pathdup(const char *fmt, ...)
>  {
> -	struct strbuf path = STRBUF_INIT;
> +	char buffer[MAX_PATH];
> +	struct strbuf path = STRBUF_WRAP_FIXED(buffer);
>  	va_list args;
>  	va_start(args, fmt);
>  	do_git_path(&path, fmt, args);
> -- snap --

I like the idea, but apparently MAX_PATH is not a hard limit that OSs
must respect (e.g., see [1]). The existing implementation can generate
paths longer than MAX_PATH whereas your replacement cannot.

This would be easy to solve by using STRBUF_WRAP_MOVABLE() instead of
STRBUF_WRAP_FIXED() in your patch. We would thereby accept the
possibility that the strbuf might need to be expanded for really long
paths. However, if that happens, then strbuf_detach() is likely to
return the string in an overallocated memory area.

So if there are callers who *really* care about not having overallocated
memory, we might want a strbuf_detach_snugly().

Your (only half-serious) idea for strbuf_realloc_snugly() would have the
disadvantage that it couldn't work with a fixed strbuf, whereas
strbuf_detach_snugly() could. It would be sad for callers to have to
worry about the distinction.

Michael

[1] http://insanecoding.blogspot.de/2007/11/pathmax-simply-isnt.html

  reply	other threads:[~2016-05-31  8:26 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-30 10:36 [PATCH 0/2] strbuf: improve API William Duclot
2016-05-30 10:36 ` [PATCH 1/2] strbuf: add tests William Duclot
2016-05-30 11:26   ` Johannes Schindelin
2016-05-30 13:42     ` Simon Rabourg
2016-05-30 11:56   ` Matthieu Moy
2016-05-31  2:04   ` Michael Haggerty
2016-05-31  9:48     ` Simon Rabourg
2016-05-30 10:36 ` [PATCH 2/2] strbuf: allow to use preallocated memory William Duclot
2016-05-30 12:13   ` Johannes Schindelin
2016-05-30 13:20     ` William Duclot
2016-05-31  6:21       ` Johannes Schindelin
2016-05-31  3:05     ` Michael Haggerty
2016-05-31  6:41       ` Johannes Schindelin
2016-05-31  8:25         ` Michael Haggerty [this message]
2016-05-30 12:52   ` Matthieu Moy
2016-05-30 14:15     ` William Duclot
2016-05-30 14:34       ` Matthieu Moy
2016-05-30 15:16         ` William Duclot
2016-05-31  4:05     ` Michael Haggerty
2016-05-31 15:59       ` William Duclot
2016-06-03 14:04       ` William Duclot
2016-05-30 21:56   ` Mike Hommey
2016-05-30 22:46     ` William Duclot
2016-05-30 22:50       ` Mike Hommey
2016-05-31  6:34   ` Junio C Hamano
2016-05-31 15:45     ` William
2016-05-31 15:54       ` Matthieu Moy
2016-05-31 16:08         ` William Duclot
2016-05-30 11:32 ` [PATCH 0/2] strbuf: improve API Remi Galan Alfonso
2016-06-01  7:42   ` Jeff King
2016-06-01 19:50     ` David Turner
2016-06-01 20:09       ` Jeff King
2016-06-01 20:22         ` David Turner
2016-06-01 21:07     ` Jeff King
2016-06-02 11:11       ` Michael Haggerty
2016-06-02 12:58         ` Matthieu Moy
2016-06-02 14:22           ` William Duclot
2016-06-24 17:20         ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=574D4A79.1030405@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=antoine.queru@ensimag.grenoble-inp.fr \
    --cc=francois.beutin@ensimag.grenoble-inp.fr \
    --cc=git@vger.kernel.org \
    --cc=matthieu.moy@grenoble-inp.fr \
    --cc=simon.rabourg@ensimag.grenoble-inp.fr \
    --cc=william.duclot@ensimag.grenoble-inp.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).