git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: "阿德烈 via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, "Jeff King" <peff@peff.net>,
	阿德烈 <adlternative@gmail.com>
Subject: Re: [PATCH] strbuf.c: optimize program logic
Date: Mon, 25 Jan 2021 22:17:41 -0800	[thread overview]
Message-ID: <xmqqy2gg2pdm.fsf@gitster.c.googlers.com> (raw)
In-Reply-To: <pull.846.git.1611637582625.gitgitgadget@gmail.com> ("阿德烈 via GitGitGadget"'s message of "Tue, 26 Jan 2021 05:06:22 +0000")

"阿德烈 via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: ZheNing Hu <adlternative@gmail.com>
>
> the usage in strbuf.h tell us"Alloc is somehow a
> "private" member that should not be messed with.
> use `strbuf_avail()`instead."

When we use the word "private", it generally means it is private to
the implementation of the API.  IOW, it is usually fine for the
implementation of the API (i.e. for strbuf API, what you see in
strbuf.c) to use private members.

In any case, these changes are _not_ optimizations.  

Replacing (alloc - len - 1) with strbuf_avail() is at best an
equivalent rewrite (which is a good thing from readability's point
of view, but not an optimization).  We know sb->alloc during the
loop is never 0, but the compiler may miss the fact, so the inlined
implementation of _avail, i.e.

	static inline size_t strbuf_avail(const struct strbuf *sb)
	{
	        return sb->alloc ? sb->alloc - sb->len - 1 : 0;
        }

may not incur call overhead, but may be pessimizing the executed
code.

If you compare the code in the loop in the second hunk below with
what _setlen() does, I think you'll see the overhead of _setlen()
relative to the original code is even higher, so it may also be
pessimizing, not optimizing.

So, overall, I am not all that enthused to see this patch.


One thing I noticed is that, whether open coded like sb->len += got
or made into parameter to strbuf_setlen(sb, sb->len + got), we are
not careful about sb->len growing too large and overflowing with the
addition.  That may potentially be an interesting thing to look
into, but at the same time, unlike the usual "compute the number of
bytes we need to allocate and then call xmalloc()" pattern, where we
try to be careful in the "compute" step by using st_add() macros,
this code actually keep growing the buffer, so by the time the size_t
overflows and wraps around, we'd certainly have exhausted the memory
already, so it won't be an issue.

But we may want to audit existing code that is not careful when
preparing the second parameter to strbuf_setlen().  We just
analyzed, if we were to accept this patch, that "sb->len + got" that
appear as the second parameter to new call of strbuf_setlen() looks
bad but would not matter in practice, but we may not be so lucky in
other places.

Thanks for a food for thought.

> diff --git a/strbuf.c b/strbuf.c
> index e3397cc4c72..76f560a28d0 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -517,7 +517,7 @@ ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint)
>  
>  	strbuf_grow(sb, hint ? hint : 8192);
>  	for (;;) {
> -		ssize_t want = sb->alloc - sb->len - 1;
> +		ssize_t want = strbuf_avail(sb);
>  		ssize_t got = read_in_full(fd, sb->buf + sb->len, want);
>  
>  		if (got < 0) {
> @@ -527,7 +527,7 @@ ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint)
>  				strbuf_setlen(sb, oldlen);
>  			return -1;
>  		}
> -		sb->len += got;
> +		strbuf_setlen(sb, sb->len + got);
>  		if (got < want)
>  			break;
>  		strbuf_grow(sb, 8192);
> @@ -543,7 +543,7 @@ ssize_t strbuf_read_once(struct strbuf *sb, int fd, size_t hint)
>  	ssize_t cnt;
>  
>  	strbuf_grow(sb, hint ? hint : 8192);
> -	cnt = xread(fd, sb->buf + sb->len, sb->alloc - sb->len - 1);
> +	cnt = xread(fd, sb->buf + sb->len, strbuf_avail(sb));
>  	if (cnt > 0)
>  		strbuf_setlen(sb, sb->len + cnt);
>  	else if (oldalloc == 0)
>
> base-commit: 6d3ef5b467eccd2769f1aa1c555d317d3c8dc707

  reply	other threads:[~2021-01-26  6:26 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-26  5:06 [PATCH] strbuf.c: optimize program logic 阿德烈 via GitGitGadget
2021-01-26  6:17 ` Junio C Hamano [this message]
2021-01-26 10:44   ` 胡哲宁
2021-01-26 18:47     ` Junio C Hamano
2021-01-26 18:23   ` Jeff King
2021-01-26 20:15     ` Junio C Hamano
2021-01-29  6:09     ` 胡哲宁
2021-01-30  8:50       ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqy2gg2pdm.fsf@gitster.c.googlers.com \
    --to=gitster@pobox.com \
    --cc=adlternative@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).