git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Elijah Newren <newren@gmail.com>
To: "René Scharfe" <l.s.r@web.de>
Cc: Junio C Hamano <gitster@pobox.com>,
	"brian m. carlson" <sandals@crustytoothpaste.net>,
	Jessica Clarke <jrtc27@jrtc27.com>, Taylor Blau <me@ttaylorr.com>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: [PATCH] Properly align memory allocations and temporary buffers
Date: Fri, 7 Jan 2022 16:18:10 -0800	[thread overview]
Message-ID: <CABPp-BG3rwC2fy-1fu0GwzofUjaF2HbJOC79BNBHxSoVS-vNAQ@mail.gmail.com> (raw)
In-Reply-To: <f40c1b47-9aad-2dcc-ceeb-5dee2b517cd8@web.de>

On Fri, Jan 7, 2022 at 3:30 PM René Scharfe <l.s.r@web.de> wrote:
>
...
> --- >8 ---
> Subject: [PATCH] stable-qsort: avoid using potentially unaligned access
>
> Like in the previous patch for compat/qsort_s.c, remove the optimization
> of using an on-stack buffer to avoid small allocations.  This ensures
> maximum alignment for the array elements and simplifies the code a bit.
>
> The performance impact for the current callers is unlikely to be
> noticeable:
>
>  * compat/mingw.c::make_environment_block() uses ALLOC_ARRAY and
>    ALLOC_GROW several times already, so another allocation of up to 1KB
>    should not matter much.

Not familiar with this code, but that makes sense to me.

>  * diffcore-rename.c::diffcore_rename_extended() is called once per diff
>    or twice per merge, and those require allocations for each object and
>    more already.

Actually, this sort could be skipped entirely if it can determine all
renames early (either because they are found via exact matches, or
they aren't relevant for the merge result, or they are found via
basename comparisons).  This sort is only called when we have to
resort to the full quadratic pairwise comparisons between files, in
which case we're also slurping in full files, doing the diffcore-delta
work to convert each file's contents into a spanhash, and then
pairwise comparing spanhashes for every pairing of source &
destination files that remain.  That work would absolutely dwarf the
malloc of a kilobyte.  So I agree, it's not worth worrying about this
one.

>  * merge-ort.c::detect_and_process_renames() is called once per merge.
>    It's responsible for the two per-merge diffcore_rename_extended()
>    calls mentioned above as well, though.  So this is possibly the most
>    impacted caller.  Per-object allocations are likely to dwarf the
>    additional small allocations in git_stable_qsort(), though.

The particular sort call directly found in
detect_and_process_renames() was once nearly 30% of overall execution
time in merge-ort[1], but according to some local notes I kept, it
eventually dropped to about ~1% of overall execution time after
trivial directory resolution[2] (due to the fact that it could just
include some higher level directories and omit all the files
underneath them -- i.e. it had far fewer paths to sort).

Since I suspect your change here would generally just be a small
percentage of the overall git_stable_qsort() time (if it can even be
measured), and git_stable_qsort() time is a small percentage of
merge-ort runtime, I think any runtime differences here would be
negligible.

[1] https://lore.kernel.org/git/140c1e89e0ec69c5c5e8a99b632c1cf25c2325d4.1623168703.git.gitgitgadget@gmail.com/
[2] https://lore.kernel.org/git/pull.988.v4.git.1626841444.gitgitgadget@gmail.com/

>
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
>  stable-qsort.c | 16 +++++-----------
>  1 file changed, 5 insertions(+), 11 deletions(-)
>
> diff --git a/stable-qsort.c b/stable-qsort.c
> index 6cbaf39f7b..7ff12467cd 100644
> --- a/stable-qsort.c
> +++ b/stable-qsort.c
> @@ -48,15 +48,9 @@ void git_stable_qsort(void *b, size_t n, size_t s,
>                       int (*cmp)(const void *, const void *))
>  {
>         const size_t size = st_mult(n, s);
> -       char buf[1024];
> -
> -       if (size < sizeof(buf)) {
> -               /* The temporary array fits on the small on-stack buffer. */
> -               msort_with_tmp(b, n, s, cmp, buf);
> -       } else {
> -               /* It's somewhat large, so malloc it.  */
> -               char *tmp = xmalloc(size);
> -               msort_with_tmp(b, n, s, cmp, tmp);
> -               free(tmp);
> -       }
> +       char *tmp;
> +
> +       tmp = xmalloc(size);
> +       msort_with_tmp(b, n, s, cmp, tmp);
> +       free(tmp);
>  }
> --
> 2.34.1

Patch looks good to me.  Reviewed-by: Elijah Newren <newren@gmail.com>

  reply	other threads:[~2022-01-08  0:18 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-05 13:23 [PATCH] Properly align memory allocations and temporary buffers Jessica Clarke
2022-01-06 21:46 ` Taylor Blau
2022-01-06 21:56   ` Jessica Clarke
2022-01-06 22:27   ` Junio C Hamano
2022-01-06 22:56     ` Jessica Clarke
2022-01-07  0:10       ` Junio C Hamano
2022-01-07  0:22         ` Jessica Clarke
2022-01-07  0:31         ` brian m. carlson
2022-01-07  0:39           ` Jessica Clarke
2022-01-07  1:43             ` brian m. carlson
2022-01-07  2:08               ` Jessica Clarke
2022-01-07  2:11                 ` Jessica Clarke
2022-01-07 19:30               ` Junio C Hamano
2022-01-07 19:33                 ` Jessica Clarke
2022-01-07 20:56                 ` René Scharfe
2022-01-07 21:30                   ` Junio C Hamano
2022-01-07 23:30                     ` René Scharfe
2022-01-08  0:18                       ` Elijah Newren [this message]
2022-01-06 23:22 ` brian m. carlson
2022-01-06 23:31   ` Jessica Clarke
2022-01-07 14:57 ` Philip Oakley
2022-01-07 16:08 ` René Scharfe
2022-01-07 16:21   ` Jessica Clarke
2022-01-12 13:58 ` Jessica Clarke
2022-01-12 15:47   ` René Scharfe
2022-01-12 15:49     ` Jessica Clarke
2022-01-23 15:24 ` [PATCH v2] mem-pool: Don't assume uintmax_t is aligned enough for all types Jessica Clarke
2022-01-23 20:17   ` Junio C Hamano
2022-01-23 20:23     ` Jessica Clarke
2022-01-23 20:28       ` Junio C Hamano
2022-01-23 20:33   ` [PATCH v3] " Jessica Clarke
2022-01-24 17:11     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CABPp-BG3rwC2fy-1fu0GwzofUjaF2HbJOC79BNBHxSoVS-vNAQ@mail.gmail.com \
    --to=newren@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jrtc27@jrtc27.com \
    --cc=l.s.r@web.de \
    --cc=me@ttaylorr.com \
    --cc=sandals@crustytoothpaste.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).