From: David Turner <dturner@twopensource.com>
To: Jeff King <peff@peff.net>
Cc: Remi Galan Alfonso <remi.galan-alfonso@ensimag.grenoble-inp.fr>,
William Duclot <william.duclot@ensimag.grenoble-inp.fr>,
git@vger.kernel.org,
simon rabourg <simon.rabourg@ensimag.grenoble-inp.fr>,
francois beutin <francois.beutin@ensimag.grenoble-inp.fr>,
antoine queru <antoine.queru@ensimag.grenoble-inp.fr>,
matthieu moy <matthieu.moy@grenoble-inp.fr>,
mhagger@alum.mit.edu
Subject: Re: [PATCH 0/2] strbuf: improve API
Date: Wed, 01 Jun 2016 16:22:22 -0400 [thread overview]
Message-ID: <1464812542.3988.12.camel@twopensource.com> (raw)
In-Reply-To: <20160601200959.GA13061@sigill.intra.peff.net>
On Wed, 2016-06-01 at 16:09 -0400, Jeff King wrote:
> On Wed, Jun 01, 2016 at 03:50:29PM -0400, David Turner wrote:
>
> > On Wed, 2016-06-01 at 03:42 -0400, Jeff King wrote:
> > > 2. Do caching tricks for strbufs used in tight loops. For
> > > example,
> > > have strbuf_release() throw its buffer into a last-used
> > > cache,
> > > and
> > > let the next strbuf_grow() use that cache entry. This cuts
> > > malloc()
> > > out of the loop.
> > >
> > > You'd probably want to protect the cache with a mutex,
> > > though.
> >
> >
> > ... or make the last-used cache be thread-local.
>
> Good idea.
>
> I almost didn't mention threading at all, because I'd be surprised if
> malloc lock contention is a serious bottleneck for us anywhere.
>
> It's hard to really say much concrete because it's not clear to me
> where
> people think strbuf allocation _is_ a bottleneck (or again, _would
> be_
> if we were to use it more).
>
> If we imagine a loop like this:
>
> for (i = 0; i < n; i++) {
> struct strbuf buf = STRBUF_INIT;
> do_something_with(&buf);
> strbuf_release(&buf);
> }
>
> then yes, we'll call malloc() and free() a lot of times. But unless
> your
> libc malloc is terrible, those calls are all probably just taking a
> mutex and reusing the same block from a free list. The strbuf case is
> actually really friendly to allocators because our blocks tend to be
> predictable sizes (they're sized by ALLOC_GROW, which has a
> deterministic set of block sizes).
>
> In practice, I'd imagine loops get more complicated than that. They
> call
> sub-functions which allocate a strbuf, and sometimes don't release it
> until other strbufs have been allocated, etc. The proposal in this
> thread is basically using the call stack to mirror the allocation
> patterns. So when the pattern of allocations matches an actual stack,
> that's good, but I'd expect a reasonably smart allocator to perform
> similarly. And when it's not a true stack, the stack-based strbufs
> are
> going to end up with wasted stack space hanging around even after
> we've
> free()'d the memory and could reuse it. And I'd still expect an
> allocator based off a linked list or something to handle such cases
> pretty well.
>
> There are also other non-big-O factors at play in modern systems,
> like
> CPU cache footprints. Is heap memory cheaper or more expensive to
> access
> than stack? I can imagine stack is kept in the cache better, but if
> you're reusing the same heap block over and over, it probably stays
> in
> cache, too. But if you have unused stack buffers that _could_ be
> reused
> (but won't be because you'd have to manually feed them to a new
> strbuf),
> that hurts your footprint. And on top of that, the stack proposals
> I've
> seen generally involve over-sizing the stack buffers out of a fear of
> actually calling malloc.
>
> And then on top of that, there is the question of whether anything
> involving a strbuf allocation is actually a tight enough loop to even
> _care_ about cache dynamics.
>
> So again, I'm slightly negative on anything that makes the code even
> slightly more complex, especially the call sites, if we can't show
> actual measurable improvement numbers.
>
> Sorry, this turned into a mini-rant, and it's not really directed at
> you, David. Your email was just what spurred me to put some of these
> thoughts into words. :)
I agree with all this stuff anyway.
next prev parent reply other threads:[~2016-06-01 20:22 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-05-30 10:36 [PATCH 0/2] strbuf: improve API William Duclot
2016-05-30 10:36 ` [PATCH 1/2] strbuf: add tests William Duclot
2016-05-30 11:26 ` Johannes Schindelin
2016-05-30 13:42 ` Simon Rabourg
2016-05-30 11:56 ` Matthieu Moy
2016-05-31 2:04 ` Michael Haggerty
2016-05-31 9:48 ` Simon Rabourg
2016-05-30 10:36 ` [PATCH 2/2] strbuf: allow to use preallocated memory William Duclot
2016-05-30 12:13 ` Johannes Schindelin
2016-05-30 13:20 ` William Duclot
2016-05-31 6:21 ` Johannes Schindelin
2016-05-31 3:05 ` Michael Haggerty
2016-05-31 6:41 ` Johannes Schindelin
2016-05-31 8:25 ` Michael Haggerty
2016-05-30 12:52 ` Matthieu Moy
2016-05-30 14:15 ` William Duclot
2016-05-30 14:34 ` Matthieu Moy
2016-05-30 15:16 ` William Duclot
2016-05-31 4:05 ` Michael Haggerty
2016-05-31 15:59 ` William Duclot
2016-06-03 14:04 ` William Duclot
2016-05-30 21:56 ` Mike Hommey
2016-05-30 22:46 ` William Duclot
2016-05-30 22:50 ` Mike Hommey
2016-05-31 6:34 ` Junio C Hamano
2016-05-31 15:45 ` William
2016-05-31 15:54 ` Matthieu Moy
2016-05-31 16:08 ` William Duclot
2016-05-30 11:32 ` [PATCH 0/2] strbuf: improve API Remi Galan Alfonso
2016-06-01 7:42 ` Jeff King
2016-06-01 19:50 ` David Turner
2016-06-01 20:09 ` Jeff King
2016-06-01 20:22 ` David Turner [this message]
2016-06-01 21:07 ` Jeff King
2016-06-02 11:11 ` Michael Haggerty
2016-06-02 12:58 ` Matthieu Moy
2016-06-02 14:22 ` William Duclot
2016-06-24 17:20 ` Jeff King
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1464812542.3988.12.camel@twopensource.com \
--to=dturner@twopensource.com \
--cc=antoine.queru@ensimag.grenoble-inp.fr \
--cc=francois.beutin@ensimag.grenoble-inp.fr \
--cc=git@vger.kernel.org \
--cc=matthieu.moy@grenoble-inp.fr \
--cc=mhagger@alum.mit.edu \
--cc=peff@peff.net \
--cc=remi.galan-alfonso@ensimag.grenoble-inp.fr \
--cc=simon.rabourg@ensimag.grenoble-inp.fr \
--cc=william.duclot@ensimag.grenoble-inp.fr \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).