Re: git-fetch takes forever on a slow network link. Can parallel mode help?

From: "brian m. carlson" <sandals@crustytoothpaste.net>
To: "R. Diez" <rdiez-2006@rd10.de>
Cc: git@vger.kernel.org
Subject: Re: git-fetch takes forever on a slow network link. Can parallel mode help?
Date: Fri, 6 Mar 2026 20:54:16 +0000
Message-ID: <aas--JZ-CCWN-o7O@fruit.crustytoothpaste.net>
In-Reply-To: <5c7c975e-2541-47e1-b789-fee1fdb77d2a@rd10.de>

On 2026-03-06 at 20:13:58, R. Diez wrote:
> Hi all:

Hey,

> I have an SMB/CIFS connection to a file server over a slow link of about 1 Mbps download, and a faster upload of about 10 Mbps.
> 
> My smallish Git repository has its single origin on that file server. Unfortunately, I cannot set up any sort of Git server on the remote host.
> 
> git fetch takes a long time. If the repository is up to date, it takes about 25 seconds to realise that there is nothing to do.
> 
> If there are changes to download, it can take half an hour, even if the new commit history is rather small.
> 
> The network link is slow, but not that slow. I wonder what may be causing the long delays.
> 
> The first question is: how come it takes so long to determine that nothing has changed? Does git-fetch need to download a biggish file every time?

1 Mbps is extremely slow by the standards of a modern disk.  A floppy
disk managed about 250 kbps[0], so your link is about four times as fast
as a floppy disk.  Hard disks in 1998 managed about 10 MB/s[1], which is
about 80 times the speed of your link.  That's definitely a big part of
the problem.
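
For reference, the arithmetic behind those ratios can be written out as a
small shell sketch.  The variable names are illustrative only; the figures
are the ones quoted above, with 10 MB/s taken as 80,000 kbps.

```shell
# All speeds in kilobits per second.
link_kbps=1000                  # the 1 Mbps download link from the report
floppy_kbps=250                 # floppy figure cited in [0]
hdd_kbps=$((10 * 8 * 1000))     # 10 MB/s from [1], converted to kbps

link_vs_floppy=$((link_kbps / floppy_kbps))
hdd_vs_link=$((hdd_kbps / link_kbps))

echo "link is ${link_vs_floppy}x a floppy; a 1998 disk is ${hdd_vs_link}x the link"
```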

Since this is presumably a bare repository, Git will first read the
remote references to determine what's available, so if you're using the
default files backend, it will read each of the refs, which may involve
many small network requests.  This performance could be improved with
`git pack-refs` or by converting to the reftable backend, which will
open fewer files.  reftable also uses some simple compression for ref
names, which will help as well, but it requires a relatively recent Git.
`git refs migrate` can be used to convert to reftable if you like.
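
As a rough sketch of the effect of `git pack-refs`, the following builds a
throwaway bare repository (an assumption standing in for the real one on
the share, and assuming the default files backend), creates a few
branches, and counts the loose ref files before and after packing.  The
repository path, branch names and demo identity are all hypothetical.

```shell
set -e
# Throwaway bare repository; NOT the real repository on the file server.
repo="$(mktemp -d)/demo.git"
git init --quiet --bare "$repo"

# Demo identity so commit-tree works without any user configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# One empty commit, with three branches pointing at it.
tree=$(git -C "$repo" mktree </dev/null)
commit=$(git -C "$repo" commit-tree -m "initial" "$tree")
for name in main topic-a topic-b; do
    git -C "$repo" update-ref "refs/heads/$name" "$commit"
done

# With the files backend, each ref is one small file to open and read.
loose_before=$(( $(find "$repo/refs" -type f | wc -l) ))

# Collapse them into the single packed-refs file.
git -C "$repo" pack-refs --all
loose_after=$(( $(find "$repo/refs" -type f | wc -l) ))
echo "loose ref files: $loose_before before, $loose_after after"

# On a sufficiently recent Git, the reftable conversion would be:
#   git -C "$repo" refs migrate --ref-format=reftable
```

On the real repository only the `git pack-refs --all` line (or the
`git refs migrate` line) would be needed; the rest is scaffolding.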

Once Git knows what the remote repository's refs are, it will need to
walk the history to find out what it does and doesn't have.  If there
are many lines of development, then Git will do more work; if there is
just one main branch to fetch, then there will be less.  This will
involve opening every loose commit or tag object or reading every packed
commit or tag object in the history path to determine what needs to be
copied.  If there's nothing to copy, then Git can determine that from
the refs and won't walk any history or copy any objects.
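
The "nothing to copy" case can be illustrated by comparing the remote's
ref with the local remote-tracking ref.  This is only an illustration of
the idea, not Git's internal logic, and it uses two throwaway repositories
(hypothetical paths and names) in place of the real setup.

```shell
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Stand-in "remote": a bare repository with one commit on main.
git init --quiet --bare "$work/origin.git"
tree=$(git -C "$work/origin.git" mktree </dev/null)
commit=$(git -C "$work/origin.git" commit-tree -m "initial" "$tree")
git -C "$work/origin.git" update-ref refs/heads/main "$commit"

# Stand-in local repository that has already fetched once.
git init --quiet "$work/clone"
git -C "$work/clone" remote add origin "$work/origin.git"
git -C "$work/clone" fetch --quiet origin

# Compare what the remote advertises with what we already have.
remote_tip=$(git -C "$work/clone" ls-remote origin refs/heads/main | cut -f1)
local_tip=$(git -C "$work/clone" rev-parse refs/remotes/origin/main)
if [ "$remote_tip" = "$local_tip" ]; then
    status="up to date"
else
    status="needs fetch"
fi
echo "origin/main: $status"
```

Timing `git ls-remote origin` on its own against the real repository
should give a rough idea of how much of the 25 seconds goes into reading
refs.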

If you _do_ have to transfer data, I'm not sure whether having the data
packed or loose will be more efficient in your case due to the slow
speed.  You can try packing the repository with `git gc` and see how
that affects future transfers.  If latency is the dominant cost, then
packing will almost certainly be more efficient.
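
To see what `git gc` does to the object store, `git count-objects -v` can
be run before and after.  A minimal sketch on a throwaway bare repository
(hypothetical path and identity, standing in for the real one):

```shell
set -e
repo="$(mktemp -d)/demo.git"
git init --quiet --bare "$repo"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Two loose objects: an empty tree and one commit that points at it.
tree=$(git -C "$repo" mktree </dev/null)
commit=$(git -C "$repo" commit-tree -m "initial" "$tree")
git -C "$repo" update-ref refs/heads/main "$commit"

# "count" is the number of loose objects, "in-pack" the number packed.
git -C "$repo" count-objects -v
git -C "$repo" gc --quiet
git -C "$repo" count-objects -v

loose_after=$(git -C "$repo" count-objects -v | awk '$1 == "count:" {print $2}')
in_pack=$(git -C "$repo" count-objects -v | awk '$1 == "in-pack:" {print $2}')
echo "after gc: $loose_after loose, $in_pack packed"
```

Fewer, larger files generally mean fewer round trips over a file-sharing
protocol, which is why packing tends to win when latency dominates.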

You can also see how long various operations take by using
`GIT_TRACE2=1`, which will give some detailed timing information that
will help you see what the expensive parts are.
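
A minimal sketch of capturing a trace follows.  It uses throwaway
repositories (hypothetical paths) so it can be tried safely, and it uses
`GIT_TRACE2_PERF` with an absolute file path, a sibling of `GIT_TRACE2`
that writes a column-based format with elapsed times, so the output ends
up in a file rather than on stderr.

```shell
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Stand-in remote with one commit, plus a local repository pointing at it.
git init --quiet --bare "$work/origin.git"
tree=$(git -C "$work/origin.git" mktree </dev/null)
commit=$(git -C "$work/origin.git" commit-tree -m "initial" "$tree")
git -C "$work/origin.git" update-ref refs/heads/main "$commit"
git init --quiet "$work/clone"
git -C "$work/clone" remote add origin "$work/origin.git"

# GIT_TRACE2=1 would print the trace to stderr instead of a file.
trace="$work/fetch-perf.log"
GIT_TRACE2_PERF="$trace" git -C "$work/clone" fetch --quiet origin

trace_lines=$(( $(wc -l < "$trace") ))
echo "wrote $trace_lines trace lines to $trace"
```

On the real repository, only the environment variable in front of the
usual `git fetch` is needed.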

If you have some trace output showing timings, we can advise on what you
might do to improve performance.

> However, the git-fetch documentation does not clearly state whether the parallel mode only helps if you have multiple remotes and/or multiple submodules. In my case, I just have a single repository with a single origin and no submodules.

Parallel mode does not help with a single remote.  All the data for a
single remote comes in one job.

[0] https://stackoverflow.com/questions/52841124/how-fast-could-you-read-write-to-floppy-disks-both-3-1-4-and-5-1-2
[1] https://goughlui.com/the-hard-disk-corner/hard-drive-performance-over-the-years/
-- 
brian m. carlson (they/them)
Toronto, Ontario, CA

