git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: Jeremiah Mahler <jmmahler@gmail.com>,
	Eric Sunshine <sunshine@sunshineco.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH v2 01/19] add strbuf_set operations
Date: Wed, 11 Jun 2014 14:05:35 +0200	[thread overview]
Message-ID: <5398460F.3040900@alum.mit.edu> (raw)
In-Reply-To: <6fe33498512fc2ca1678517e51dc2e94a4260ff4.1402348696.git.jmmahler@gmail.com>

On 06/10/2014 12:19 AM, Jeremiah Mahler wrote:
> Currently, the data in a strbuf is modified using add operations.  To
> set the buffer to some data a reset must be performed before an add.
> 
>   strbuf_reset(buf);
>   strbuf_add(buf, cb.buf.buf, cb.buf.len);
> 
> And this is a common sequence of operations with 70 occurrences found in
> the current source code.  This includes all the different variations
> (add, addf, addstr, addbuf, addch).
> 
>   FILES=`find ./ -name '*.c'`
>   CNT=$(pcregrep -M "strbuf_reset.*\n.*strbuf_add" $FILES | wc -l)
>   CNT=$(echo "$CNT / 2" | bc)
>   echo $CNT
>   70
> 
> These patches add strbuf_set operations which allow this common sequence
> to be performed in one line instead of two.
> 
>   strbuf_set(buf, cb.buf.buf, cb.buf.len);
> 
> Signed-off-by: Jeremiah Mahler <jmmahler@gmail.com>
> ---
>  Documentation/technical/api-strbuf.txt | 18 ++++++++++++++++++
>  strbuf.c                               | 21 +++++++++++++++++++++
>  strbuf.h                               | 14 ++++++++++++++
>  3 files changed, 53 insertions(+)
> 
> diff --git a/Documentation/technical/api-strbuf.txt b/Documentation/technical/api-strbuf.txt
> index 077a709..b7e23da 100644
> --- a/Documentation/technical/api-strbuf.txt
> +++ b/Documentation/technical/api-strbuf.txt
> @@ -149,6 +149,24 @@ Functions
>  	than zero if the first buffer is found, respectively, to be less than,
>  	to match, or be greater than the second buffer.
>  
> +* Setting the buffer
> +
> +`strbuf_set`::
> +
> +	Replace the buffer content with data of a given length.
> +
> +`strbuf_setstr`::
> +
> +	Replace the buffer content with data from a NUL-terminated string.
> +
> +`strbuf_setf`::
> +
> +	Replace the buffer content with a formatted string.
> +
> +`strbuf_setbuf`::
> +
> +	Replace the buffer content with data from another buffer.
> +
>  * Adding data to the buffer
>  
>  NOTE: All of the functions in this section will grow the buffer as necessary.
> diff --git a/strbuf.c b/strbuf.c
> index ac62982..9d64b00 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -189,6 +189,27 @@ void strbuf_splice(struct strbuf *sb, size_t pos, size_t len,
>  	strbuf_setlen(sb, sb->len + dlen - len);
>  }
>  
> +void strbuf_set(struct strbuf *sb, const void *data, size_t len)
> +{
> +	strbuf_reset(sb);
> +	strbuf_add(sb, data, len);
> +}
> +

I never know how much intelligence to attribute to modern compilers, but
it seems like even after optimization this function will be doing more
work than necessary:

    strbuf_reset(sb)
    -> strbuf_setlen(sb, 0)
       -> sb->len = 0
       -> sb->buf[len] = '\0'
    -> strbuf_add(sb, data, len)
       -> strbuf_grow(sb, len)
          -> ...lots of stuff...
       -> memcpy(sb->buf + sb->len, data, len)
       -> strbuf_setlen(sb, sb->len + len)
          -> sb->len = len
          -> sb->buf[len] = '\0'

If there were a function like strbuf_grow_to(sb, len):

void strbuf_grow_to(struct strbuf *sb, size_t len)
{
	int new_buf = !sb->alloc;
	if (unsigned_add_overflows(len, 1))
		die("you want to use way too much memory");
	if (new_buf)
		sb->buf = NULL;
	ALLOC_GROW(sb->buf, len + 1, sb->alloc);
	if (new_buf)
		sb->buf[0] = '\0';
}

(strbuf_grow() could call it:

static inline void strbuf_grow(struct strbuf *sb, size_t extra)
{
	if (unsigned_add_overflows(sb->len, extra))
		die("you want to use way too much memory");
	strbuf_grow_to(sb, sb->len + extra);
}

), then your function could be minimized to

void strbuf_set(struct strbuf *sb, const void *data, size_t len)
{
	strbuf_grow_to(sb, len);
	memcpy(sb->buf, data, len);
	strbuf_setlen(sb, len);
}

I think strbuf_grow_to() would be useful in other situations too.

This is just an idea; I'm not saying that you have to make this change.

> [...]

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

  parent reply	other threads:[~2014-06-11 12:05 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-09 22:19 [PATCH v2 00/19] add strbuf_set operations Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 01/19] " Jeremiah Mahler
2014-06-11  8:42   ` Eric Sunshine
2014-06-12  7:10     ` Jeremiah Mahler
2014-06-11 12:05   ` Michael Haggerty [this message]
2014-06-12  7:10     ` Jeremiah Mahler
2014-06-12 10:21       ` Michael Haggerty
2014-06-12 23:59         ` Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 02/19] sha1_name: simplify via strbuf_set() Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 03/19] fast-import: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 04/19] builtin/remote: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 05/19] branch: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 06/19] builtin/branch: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 07/19] builtin/checkout: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 08/19] builtin/mailinfo: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 09/19] builtin/tag: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 10/19] date: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 11/19] diffcore-order: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 12/19] http-backend: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 13/19] http: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 14/19] ident: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 15/19] remote-curl: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 16/19] submodule: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 17/19] transport: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 18/19] vcs-svn/svndump: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 19/19] wt-status: " Jeremiah Mahler
2014-06-11  8:09 ` [PATCH v2 00/19] add strbuf_set operations Eric Sunshine
2014-06-12  7:09   ` Jeremiah Mahler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5398460F.3040900@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=git@vger.kernel.org \
    --cc=jmmahler@gmail.com \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).