All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: Jeremiah Mahler <jmmahler@gmail.com>,
	Eric Sunshine <sunshine@sunshineco.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH v2 01/19] add strbuf_set operations
Date: Wed, 11 Jun 2014 14:05:35 +0200	[thread overview]
Message-ID: <5398460F.3040900@alum.mit.edu> (raw)
In-Reply-To: <6fe33498512fc2ca1678517e51dc2e94a4260ff4.1402348696.git.jmmahler@gmail.com>

On 06/10/2014 12:19 AM, Jeremiah Mahler wrote:
> Currently, the data in a strbuf is modified using add operations.  To
> set the buffer to some data a reset must be performed before an add.
> 
>   strbuf_reset(buf);
>   strbuf_add(buf, cb.buf.buf, cb.buf.len);
> 
> And this is a common sequence of operations with 70 occurrences found in
> the current source code.  This includes all the different variations
> (add, addf, addstr, addbuf, addch).
> 
>   FILES=`find ./ -name '*.c'`
>   CNT=$(pcregrep -M "strbuf_reset.*\n.*strbuf_add" $FILES | wc -l)
>   CNT=$(echo "$CNT / 2" | bc)
>   echo $CNT
>   70
> 
> These patches add strbuf_set operations which allow this common sequence
> to be performed in one line instead of two.
> 
>   strbuf_set(buf, cb.buf.buf, cb.buf.len);
> 
> Signed-off-by: Jeremiah Mahler <jmmahler@gmail.com>
> ---
>  Documentation/technical/api-strbuf.txt | 18 ++++++++++++++++++
>  strbuf.c                               | 21 +++++++++++++++++++++
>  strbuf.h                               | 14 ++++++++++++++
>  3 files changed, 53 insertions(+)
> 
> diff --git a/Documentation/technical/api-strbuf.txt b/Documentation/technical/api-strbuf.txt
> index 077a709..b7e23da 100644
> --- a/Documentation/technical/api-strbuf.txt
> +++ b/Documentation/technical/api-strbuf.txt
> @@ -149,6 +149,24 @@ Functions
>  	than zero if the first buffer is found, respectively, to be less than,
>  	to match, or be greater than the second buffer.
>  
> +* Setting the buffer
> +
> +`strbuf_set`::
> +
> +	Replace the buffer content with data of a given length.
> +
> +`strbuf_setstr`::
> +
> +	Replace the buffer content with data from a NUL-terminated string.
> +
> +`strbuf_setf`::
> +
> +	Replace the buffer content with a formatted string.
> +
> +`strbuf_setbuf`::
> +
> +	Replace the buffer content with data from another buffer.
> +
>  * Adding data to the buffer
>  
>  NOTE: All of the functions in this section will grow the buffer as necessary.
> diff --git a/strbuf.c b/strbuf.c
> index ac62982..9d64b00 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -189,6 +189,27 @@ void strbuf_splice(struct strbuf *sb, size_t pos, size_t len,
>  	strbuf_setlen(sb, sb->len + dlen - len);
>  }
>  
> +void strbuf_set(struct strbuf *sb, const void *data, size_t len)
> +{
> +	strbuf_reset(sb);
> +	strbuf_add(sb, data, len);
> +}
> +

I never know how much intelligence to attribute to modern compilers, but
it seems like even after optimization this function will be doing more
work than necessary:

    strbuf_reset(sb)
    -> strbuf_setlen(sb, 0)
       -> sb->len = 0
       -> sb->buf[len] = '\0'
    -> strbuf_add(sb, data, len)
       -> strbuf_grow(sb, len)
          -> ...lots of stuff...
       -> memcpy(sb->buf + sb->len, data, len)
       -> strbuf_setlen(sb, sb->len + len)
          -> sb->len = len
          -> sb->buf[len] = '\0'

If there were a function like strbuf_grow_to(sb, len):

void strbuf_grow_to(struct strbuf *sb, size_t len)
{
	int new_buf = !sb->alloc;
	if (unsigned_add_overflows(len, 1))
		die("you want to use way too much memory");
	if (new_buf)
		sb->buf = NULL;
	ALLOC_GROW(sb->buf, len + 1, sb->alloc);
	if (new_buf)
		sb->buf[0] = '\0';
}

(strbuf_grow() could call it:

static inline void strbuf_grow(struct strbuf *sb, size_t extra)
{
	if (unsigned_add_overflows(sb->len, extra))
		die("you want to use way too much memory");
	strbuf_grow_to(sb, sb->len + extra);
}

), then your function could be minimized to

void strbuf_set(struct strbuf *sb, const void *data, size_t len)
{
	strbuf_grow_to(sb, len);
	memcpy(sb->buf, data, len);
	strbuf_setlen(sb, len);
}

I think strbuf_grow_to() would be useful in other situations too.

This is just an idea; I'm not saying that you have to make this change.

> [...]

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

  parent reply	other threads:[~2014-06-11 12:05 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-09 22:19 [PATCH v2 00/19] add strbuf_set operations Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 01/19] " Jeremiah Mahler
2014-06-11  8:42   ` Eric Sunshine
2014-06-12  7:10     ` Jeremiah Mahler
2014-06-11 12:05   ` Michael Haggerty [this message]
2014-06-12  7:10     ` Jeremiah Mahler
2014-06-12 10:21       ` Michael Haggerty
2014-06-12 23:59         ` Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 02/19] sha1_name: simplify via strbuf_set() Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 03/19] fast-import: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 04/19] builtin/remote: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 05/19] branch: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 06/19] builtin/branch: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 07/19] builtin/checkout: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 08/19] builtin/mailinfo: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 09/19] builtin/tag: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 10/19] date: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 11/19] diffcore-order: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 12/19] http-backend: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 13/19] http: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 14/19] ident: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 15/19] remote-curl: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 16/19] submodule: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 17/19] transport: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 18/19] vcs-svn/svndump: " Jeremiah Mahler
2014-06-09 22:19 ` [PATCH v2 19/19] wt-status: " Jeremiah Mahler
2014-06-11  8:09 ` [PATCH v2 00/19] add strbuf_set operations Eric Sunshine
2014-06-12  7:09   ` Jeremiah Mahler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5398460F.3040900@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=git@vger.kernel.org \
    --cc=jmmahler@gmail.com \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.