From: Patrick Steinhardt <ps@pks.im>
To: git@vger.kernel.org
Cc: Derrick Stolee <derrickstolee@github.com>
Subject: [PATCH v2 0/5] fetch: more optimizations for mirror fetches
Date: Tue, 1 Mar 2022 10:33:33 +0100
Message-ID: <cover.1646127015.git.ps@pks.im>
In-Reply-To: <cover.1645619224.git.ps@pks.im>

Hi,

this is another patch series that aims to speed up mirror fetches. It
applies on top of e6ebfd0e8c (The sixth batch, 2022-02-18) with
3824153b23 (Merge branch 'ps/fetch-atomic' into next, 2022-02-18) merged
into it to fix a conflict.

The only change compared to v1 is an update to the benchmarks so that
they're less verbose, as proposed by Derrick. I also had a look at
introducing a new helper `parse_object_probably_commit()`, but I didn't
find the end result to be much of an improvement compared to the ad-hoc
`lookup_commit_in_graph() || parse_object()` dance we do right now.
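
For readers unfamiliar with the pattern named above, here is a minimal,
self-contained sketch of "try the cheap lookup first, fall back to a full
parse". Every identifier in it (`toy_commit`, `toy_lookup_in_graph()`,
`toy_parse_object()`, `toy_lookup()`) is a hypothetical stand-in and not
git's API; the real functions operate on git's own repository and object
types, which are not reproduced here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a parsed commit; not git's struct commit. */
struct toy_commit {
	const char *id;
};

/* Toy data: the "graph" covers only some of the objects in the "store". */
static struct toy_commit toy_graph[] = { { "aaa" }, { "bbb" } };
static struct toy_commit toy_store[] = { { "aaa" }, { "bbb" }, { "ccc" } };

/* Counts how often the expensive path was taken. */
static int toy_slow_parses;

/* Linear search helper shared by both lookup paths. */
static struct toy_commit *toy_find(struct toy_commit *list, size_t nr,
				   const char *id)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (!strcmp(list[i].id, id))
			return &list[i];
	return NULL;
}

/* Cheap path: only knows the commits recorded in the toy graph. */
static struct toy_commit *toy_lookup_in_graph(const char *id)
{
	return toy_find(toy_graph, sizeof(toy_graph) / sizeof(toy_graph[0]), id);
}

/* Expensive path: consults the full toy store and records that it ran. */
static struct toy_commit *toy_parse_object(const char *id)
{
	toy_slow_parses++;
	return toy_find(toy_store, sizeof(toy_store) / sizeof(toy_store[0]), id);
}

/* The fallback itself: take the cheap path and parse only on a miss. */
static struct toy_commit *toy_lookup(const char *id)
{
	struct toy_commit *commit = toy_lookup_in_graph(id);

	if (!commit)
		commit = toy_parse_object(id);
	return commit;
}
```

In this toy setup, `toy_lookup("aaa")` is answered by the cheap path alone,
while `toy_lookup("ccc")` misses the graph and takes the expensive path
once.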

Thanks!

Patrick

Patrick Steinhardt (5):
  upload-pack: look up "want" lines via commit-graph
  fetch: avoid lookup of commits when not appending to FETCH_HEAD
  refs: add ability for backends to special-case reading of symbolic
    refs
  remote: read symbolic refs via `refs_read_symbolic_ref()`
  refs/files-backend: optimize reading of symbolic refs

 builtin/fetch.c       | 42 +++++++++++++++++++++++++++---------------
 builtin/remote.c      |  8 +++++---
 refs.c                | 17 +++++++++++++++++
 refs.h                |  3 +++
 refs/debug.c          |  1 +
 refs/files-backend.c  | 33 ++++++++++++++++++++++++++++-----
 refs/packed-backend.c |  1 +
 refs/refs-internal.h  | 16 ++++++++++++++++
 remote.c              | 14 +++++++-------
 upload-pack.c         | 20 +++++++++++++++++---
 10 files changed, 122 insertions(+), 33 deletions(-)

Range-diff against v1:
1:  ca5e136cca ! 1:  b5c696bd8e upload-pack: look up "want" lines via commit-graph
    @@ Commit message
         Refactor parsing of both "want" and "want-ref" lines to do so.
     
         The following benchmark is executed in a repository with a huge number
    -    of references. It uses cached request from git-fetch(1) as input and
    -    contains about 876,000 "want" lines:
    +    of references. It uses cached request from git-fetch(1) as input to
    +    git-upload-pack(1) that contains about 876,000 "want" lines:
     
    -        Benchmark 1: git-upload-pack (HEAD~)
    +        Benchmark 1: HEAD~
               Time (mean ± σ):      7.113 s ±  0.028 s    [User: 6.900 s, System: 0.662 s]
               Range (min … max):    7.072 s …  7.168 s    10 runs
     
    -        Benchmark 2: git-upload-pack (HEAD)
    +        Benchmark 2: HEAD
               Time (mean ± σ):      6.622 s ±  0.061 s    [User: 6.452 s, System: 0.650 s]
               Range (min … max):    6.535 s …  6.727 s    10 runs
     
             Summary
    -          'git-upload-pack (HEAD)' ran
    -            1.07 ± 0.01 times faster than 'git-upload-pack (HEAD~)'
    +          'HEAD' ran
    +            1.07 ± 0.01 times faster than 'HEAD~'
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
2:  80f993dddd ! 2:  fbe76b78c3 fetch: avoid lookup of commits when not appending to FETCH_HEAD
    @@ Commit message
     
         Skip this busywork in case we're not writing to FETCH_HEAD. The
         following benchmark performs a mirror-fetch in a repository with about
    -    two million references:
    +    two million references via `git fetch --prune --no-write-fetch-head
    +    +refs/*:refs/*`:
     
    -        Benchmark 1: git fetch --prune --no-write-fetch-head +refs/*:refs/* (HEAD~)
    +        Benchmark 1: HEAD~
               Time (mean ± σ):     75.388 s ±  1.942 s    [User: 71.103 s, System: 8.953 s]
               Range (min … max):   73.184 s … 76.845 s    3 runs
     
    -        Benchmark 2: git fetch --prune --no-write-fetch-head +refs/*:refs/* (HEAD)
    +        Benchmark 2: HEAD
               Time (mean ± σ):     69.486 s ±  1.016 s    [User: 65.941 s, System: 8.806 s]
               Range (min … max):   68.864 s … 70.659 s    3 runs
     
             Summary
    -          'git fetch --prune --no-write-fetch-head +refs/*:refs/* (HEAD)' ran
    -            1.08 ± 0.03 times faster than 'git fetch --prune --no-write-fetch-head +refs/*:refs/* (HEAD~)'
    +          'HEAD' ran
    +            1.08 ± 0.03 times faster than 'HEAD~'
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
3:  28cacbdbe2 = 3:  29eb81d37c refs: add ability for backends to special-case reading of symbolic refs
4:  1d24101fe4 = 4:  0489380e00 remote: read symbolic refs via `refs_read_symbolic_ref()`
5:  7213ffdbdd ! 5:  b6eca63d3b refs/files-backend: optimize reading of symbolic refs
    @@ Commit message
         need to skip updating local symbolic references during a fetch, which is
         why the change results in a significant speedup when doing fetches in
         repositories with huge numbers of references. The following benchmark
    -    executes a mirror-fetch in a repository with about 2 million references:
    +    executes a mirror-fetch in a repository with about 2 million references
    +    via `git fetch --prune --no-write-fetch-head +refs/*:refs/*`:
     
    -        Benchmark 1: git fetch --prune --no-write-fetch-head +refs/*:refs/* (HEAD~)
    +        Benchmark 1: HEAD~
               Time (mean ± σ):     68.372 s ±  2.344 s    [User: 65.629 s, System: 8.786 s]
               Range (min … max):   65.745 s … 70.246 s    3 runs
     
    -        Benchmark 2: git fetch --prune --no-write-fetch-head +refs/*:refs/* (HEAD)
    +        Benchmark 2: HEAD
               Time (mean ± σ):     60.259 s ±  0.343 s    [User: 61.019 s, System: 7.245 s]
               Range (min … max):   60.003 s … 60.649 s    3 runs
     
             Summary
    -          'git fetch --prune --no-write-fetch-head +refs/*:refs/* (HEAD)' ran
    -            1.13 ± 0.04 times faster than 'git fetch --prune --no-write-fetch-head +refs/*:refs/* (HEAD~)'
    +          'HEAD' ran
    +            1.13 ± 0.04 times faster than 'HEAD~'
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
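
The summary factors in the benchmarks above can be checked by hand. The
sketch below assumes that each factor is the mean runtime of HEAD~ divided
by the mean runtime of HEAD; this mail names neither the benchmarking tool
nor its formula, so that is an assumption. `speedup()` and
`time_saved_percent()` are hypothetical helper names.

```c
/* Ratio of mean runtimes: how many times faster the new revision ran. */
static double speedup(double old_mean_s, double new_mean_s)
{
	return old_mean_s / new_mean_s;
}

/* Share of the old runtime that the new revision saves, in percent. */
static double time_saved_percent(double old_mean_s, double new_mean_s)
{
	return 100.0 * (old_mean_s - new_mean_s) / old_mean_s;
}
```

With the means reported above, `speedup(7.113, 6.622)` is about 1.074,
`speedup(75.388, 69.486)` about 1.085 and `speedup(68.372, 60.259)` about
1.135. These values are consistent with the reported factors of 1.07, 1.08
and 1.13.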
     
-- 
2.35.1

