git.vger.kernel.org archive mirror
* [BUG] `git push` sends unnecessary objects
@ 2023-09-13 22:59 Javier Mora
From: Javier Mora @ 2023-09-13 22:59 UTC
  To: git

I came across this issue accidentally while moving a directory
containing a very large file and deleting another file in that
directory at the same time.
It seems to be caused by `pack.useSparse=true` being the default since
v2.27. I found this out by manually bisecting and compiling git, after
noticing that it didn't happen in v2.25; commit de3a864 introduces
this regression.

* Expected:
    Pushing a commit that moves a file without modifying it shouldn't
    require sending a blob object for that file, since the remote
    server already has that blob object.
* Observed:
    Pushing a commit that moves a directory containing a file and also
    adds/deletes other files in that directory will, for some reason,
    also send blobs for all the files in that directory, even the ones
    that were already in the remote.
* Consequences:
    This has a very big impact on push times for very small commits
    that just move files around, if those files are very big (I had
    this happen with a >100MB file over a problematic connection...
    yikes!)
* Note:
    The commit introducing the regression does warn about possible
    scenarios involving a special arrangement of exact copies across
    directories, but these are not "copies"; I just moved a file,
    which seems like a rather common operation.

Code snippet for reproduction:
```
mkdir TEST_git
cd TEST_git

mkdir -p local remote/origin.git
cd remote/origin.git
git init --bare
cd ../../local
git init
git remote add origin file://"${PWD%/*}"/remote/origin.git

mkdir zig
for i in a b c d e; do
    dd if=/dev/urandom of=zig/"$i" bs=1M count=1
done
git add .
git commit -m 'Add big files'
git push -u origin master   # assumes the initial branch is named "master"
#>> Writing objects: 100% (8/8), 5.00 MiB | 13.27 MiB/s, done.
#^ makes sense: 1 commit + 2 trees (/ and /zig) + 5 files = 8;
#  5 MiB in total for the 5x 1 MiB binary files

git mv zig zag
git commit -m 'Move zig'
git push
#>> Writing objects: 100% (2/2), 233 bytes | 233.00 KiB/s, done.
#^ makes sense: 1 commit + 1 tree (/ renames /zig to /zag) = 2;
#  a,b,c,d,e objects already in remote

git mv zag zog
touch zog/f
git add zog/f
git commit -m 'For great justice'
git push
#>> Writing objects: 100% (9/9), 5.00 MiB | 24.63 MiB/s, done.
#^ It re-uploaded the 5x 1 MiB blobs
#  even though remote already had them.
```
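
A minimal sketch of one way to confirm that the remote already holds a
moved file's blob before the final push, so that re-sending it is
redundant. It rebuilds a smaller version of the setup above in a
temporary directory; the directory layout, file contents and the
`remote_has_blob` variable are illustrative assumptions, not part of
the original reproduction.

```shell
set -e
work=$(mktemp -d)

# Bare "remote" and a local clone-like repository, as in the reproduction.
git init -q --bare "$work/origin.git"
git init -q "$work/local"
cd "$work/local"
git config user.name "Example"              # placeholder identity
git config user.email "example@example.com"
git remote add origin "file://$work/origin.git"

# First commit: one file, pushed to the remote.
mkdir zig
echo "payload" > zig/a
git add .
git commit -q -m 'Add file'
git push -q origin HEAD

# Second commit (not pushed yet): move the directory and add a new file.
git mv zig zog
touch zog/f
git add zog/f
git commit -q -m 'Move and add'

# The moved file keeps the same blob ID; ask the remote whether it has it.
blob=$(git rev-parse HEAD:zog/a)
if git --git-dir="$work/origin.git" cat-file -e "$blob"; then
    remote_has_blob=yes
else
    remote_has_blob=no
fi
echo "remote already has $blob: $remote_has_blob"
```

`git cat-file -e` exits with status 0 only if the object exists, so a
"yes" here means the blob for `zog/a` is already in the remote before
the second push is attempted.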

Note that the re-upload in the last push doesn't happen if I use
`git -c pack.useSparse=false push`.
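
As a possible workaround sketch, the same setting can be stored in the
repository's configuration instead of being passed with `-c` on every
push. The scratch repository below is only there to keep the example
self-contained; in practice the `git config` line would be run inside
the affected repository.

```shell
# Scratch repository for illustration only.
repo=$(mktemp -d)
git init -q "$repo"

# Persist the setting for this repository.
git -C "$repo" config pack.useSparse false

# Read it back to confirm what later pushes from this repository will use.
value=$(git -C "$repo" config --get pack.useSparse)
echo "pack.useSparse=$value"
```

This only changes the local repository's config; it does not affect
other clones or the remote.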

