* Does "git push" open a pack for read before closing it?
From: git-mailinglist @ 2018-12-21 12:46 UTC
To: git

[Major ignorance alert]

I'm writing software to implement a FUSE mount for a decentralised file
system and during testing with git I see some strange behaviour which
I'd like to investigate. It might be a bug in my code, or even the FUSE
lib I'm using, or it might be intended behaviour by git. So one thing
I'd like to do is check if this is expected in git.

SYSTEM
OS: Ubuntu 18.10
git version 2.19.1
Decentralised storage mounted at ~/SAFE

What I'm doing

I'm testing my FUSE implementation for SAFE Network while exploring the
use of git with decentralised storage, so not necessarily in a sensible
arrangement (comments on that also welcome).

I have a folder at ~/SAFE/_public/tests/data1/ and want to create a bare
repo there to use as a remote from my local drive for an existing git
repo at ~/src/safe/sjs.git

Anyway, I do the following sequence of commands which are all fine up
until the last one which eventually fails:

  cd ~/SAFE/_public/tests/data1
  git init --bare blah
  cd ~/src/safe/sjs.git
  git remote remove origin
  git remote add origin ~/SAFE/_public/tests/data1/blah
  git push origin master

Here's the output from the last command above:

  Enumerating objects: 373, done.
  Counting objects: 100% (373/373), done.
  Delta compression using up to 8 threads
  Compressing objects: 100% (371/371), done.
  Writing objects: 100% (373/373), 187.96 KiB | 33.00 KiB/s, done.
  Total 373 (delta 254), reused 0 (delta 0)
  remote: fatal: unable to open /home/mrh/SAFE/_public/tests/data1/blah/./objects/incoming-73lbb6/pack/tmp_pack_pL28kQ: Remote I/O error
  error: remote unpack failed: index-pack abnormal exit
  To /home/mrh/SAFE/_public/tests/data1/blah
   ! [remote rejected] master -> master (unpacker error)
  error: failed to push some refs to '/home/mrh/SAFE/_public/tests/data1/blah'

Inspecting the logs from my FUSE implementation I see that there's a
problem related to this file on the mounted storage:

  /_public/tests/data1/blah/objects/incoming-73lbb6/pack/tmp_pack_pL28kQ

Prior to the error the file is written to multiple times by git - all
good (about 200kB in all). Then, before the file is closed I see an
attempt to open it for read, which fails. The failure is because I don't
support read on a file that is open for write yet, and I'm not sure if
that is sensible or what git might be expecting to do given the file has
not even been flushed to disk at this point.

So I'd like to know if this is expected behaviour by git (or where to
look to find out), and if it is expected, then what might git expect to
do if the file were opened successfully?

N.B. After the failure, the file is closed and then deleted!

Also note that it is possible the behaviour I'm seeing is not really git
but another issue, such as a bug in the sync/async aspect of my code.

Thanks

Mark

-- 
Secure Access For Everyone:
- SAFE Network
- First Autonomous Decentralised Internet
https://safenetwork.tech

* Re: Does "git push" open a pack for read before closing it?
From: brian m. carlson @ 2018-12-22 23:12 UTC
To: git-mailinglist
Cc: git

On Fri, Dec 21, 2018 at 12:46:35PM +0000, git-mailinglist@happybeing.com wrote:
> Here's the output from the last command above:
>
> Enumerating objects: 373, done.
> Counting objects: 100% (373/373), done.
> Delta compression using up to 8 threads
> Compressing objects: 100% (371/371), done.
> Writing objects: 100% (373/373), 187.96 KiB | 33.00 KiB/s, done.
> Total 373 (delta 254), reused 0 (delta 0)
> remote: fatal: unable to open
> /home/mrh/SAFE/_public/tests/data1/blah/./objects/incoming-73lbb6/pack/tmp_pack_pL28kQ:
> Remote I/O error
> error: remote unpack failed: index-pack abnormal exit
> To /home/mrh/SAFE/_public/tests/data1/blah
>  ! [remote rejected] master -> master (unpacker error)
> error: failed to push some refs to '/home/mrh/SAFE/_public/tests/data1/blah'
>
> Inspecting the logs from my FUSE implementation I see that there's a
> problem related to this file on the mounted storage:
>
> /_public/tests/data1/blah/objects/incoming-73lbb6/pack/tmp_pack_pL28kQ
>
> Prior to the error the file is written to multiple times by git - all
> good (about 200kB in all). Then, before the file is closed I see an
> attempt to open it for read, which fails. The failure is because I don't
> support read on a file that is open for write yet, and I'm not sure if
> that is sensible or what git might be expecting to do given the file has
> not even been flushed to disk at this point.

What I expect is happening is that Git receives the objects and writes
them to a temporary file (which you see in "objects/incoming") and then
they're passed to either git unpack-objects or git index-pack, which
then attempts to read it.

> So I'd like to know if this is expected behaviour by git (or where to
> look to find out), and if it is expected, then what might git expect to
> do if the file were opened successfully?

This behavior is expected. POSIX says that a read that can be proved to
have occurred after a write must contain the new data, so it's possible
that a separate process may choose to read the file and index it,
knowing that the index process was started after all the writes.

This is definitely an important invariant to preserve if your FUSE file
system is going to be used on a Unix system. In other words, consistency
(in the CAP sense) is required.

> N.B. After the failure, the file is closed and then deleted!

Right, if this had succeeded, we would have renamed it into place (or
unpacked it and deleted it), but since it failed, we clean up after
ourselves so as not to leave large temporary files around.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204
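
Editorial sketch (not part of the original messages): the read-after-write
expectation described above can be checked against any mounted directory
with a few lines of Python. This is a minimal sketch under stated
assumptions: the function name `read_while_writer_open` and its `directory`
parameter are hypothetical, and the code only imitates the access pattern
reported in the thread (write, then open a second descriptor for read
before the writer is closed). It is not what git itself runs.

```python
import os
import tempfile


def read_while_writer_open(payload: bytes, directory=None) -> bytes:
    """Write payload through one descriptor, then read it back through a
    second descriptor that is opened while the writer is still open.

    `directory` is a hypothetical knob: point it at the mount under test.
    """
    fd_w, path = tempfile.mkstemp(prefix="tmp_pack_", dir=directory)
    try:
        view = memoryview(payload)
        while view:                          # os.write may write fewer bytes than asked
            written = os.write(fd_w, view)
            view = view[written:]
        # The writer is deliberately NOT closed or fsync'ed here.
        fd_r = os.open(path, os.O_RDONLY)    # the step the FUSE mount rejected
        try:
            # Positional read from offset 0; a POSIX file system is expected
            # to return the bytes written above.
            return os.pread(fd_r, len(payload), 0)
        finally:
            os.close(fd_r)
    finally:
        os.close(fd_w)
        os.unlink(path)
```

On a local POSIX file system the function should return the payload
unchanged. Passing a directory on the FUSE mount as `directory` would show
whether the mount honours the same guarantee; an exception from `os.open`
or a short result would point at the file system rather than at git.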

* Re: Does "git push" open a pack for read before closing it?
From: git-mailinglist @ 2019-01-07 15:56 UTC
To: git
Cc: brian m. carlson

On 22/12/2018 23:12, brian m. carlson wrote:

Thanks Brian, you helped me make some progress. I'm stuck again trying
to understand git behaviour though and wondering if there are better
ways of me seeing into git (source, debug o/p etc) than posting here.

As a reminder, I'm doing the following to create a bare repository on my
FUSE mounted decentralised storage:

  cd ~/SAFE/_public/tests/data1
  git init --bare blah
  cd ~/src/safe/sjs.git
  git remote remove origin
  git remote add origin ~/SAFE/_public/tests/data1/blah
  git push origin master

The bugs are in my implementation of FUSE on the SAFE storage.

I get additional output from git using the following (but it doesn't
help me):

  set -x; GIT_TRACE=2 GIT_CURL_VERBOSE=2 GIT_TRACE_PERFORMANCE=2 \
  GIT_TRACE_PACK_ACCESS=2 GIT_TRACE_PACKET=2 GIT_TRACE_PACKFILE=2 \
  GIT_TRACE_SETUP=2 GIT_TRACE_SHALLOW=2 git push origin master -v -v \
  2>&1 |tee ~/git-trace.log; set +x

Anyway, to add a little to your observations...

> What I expect is happening is that Git receives the objects and writes
> them to a temporary file (which you see in "objects/incoming") and then
> they're passed to either git unpack-objects or git index-pack, which
> then attempts to read it.

The git console output seems to confirm it is 'git index-pack' that
encounters the error, which is currently:

  Enumerating objects: 373, done.
  Counting objects: 100% (373/373), done.
  Delta compression using up to 8 threads
  Compressing objects: 100% (371/371), done.
  Writing objects: 100% (373/373), 192.43 KiB | 54.00 KiB/s, done.
  Total 373 (delta 255), reused 0 (delta 0)
  remote: fatal: premature end of pack file, 36 bytes missing
  remote: fatal: premature end of pack file, 65 bytes missing
  error: remote unpack failed: index-pack abnormal exit
  To /home/mrh/SAFE/_public/tests/data1/blah
   ! [remote rejected] master -> master (unpacker error)
  error: failed to push some refs to '/home/mrh/SAFE/_public/tests/data/blah'

So I conclude I'm either not writing the file properly, or not reading
it back properly. I can continue looking into that of course, but
looking at the file requests I'm curious about what git is doing and how
to learn more about it as it looks odd.

I have quite a few questions, but will focus on just the point at which
it bails out. In summary, what I see is:

- The pack file is created and written with multiple calls, ending up
  about 200k long.

- While still open for write, it is opened *four* times, so git has five
  handles active on it. One write and four read.

- At this point I see the following FUSE read operation:

    read('/_public/tests/data1/blah/objects/incoming-quFPHB/pack/tmp_pack_E4ea92', 58, buf, 4096, 16384)

  58 is the file handle, 4096 the length of buf, and 16384 the position

- Presumably this is where git encounters a problem because it then
  closes everything and cleans up the incoming directory.

It seems odd to me that it is starting to read the pack file at position
16384 rather than at 0 (or at 12 after the header). I can surmise it
might open it four times to speed access, but would expect to see it
read the beginning of the file (or at position 12) before trying to
interpret the content and bailing out.

So I'm wondering what git is doing there. Any comments on this, or a
pointer to the relevant git code so I can look myself would be great.

Thanks,

Mark
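
Editorial sketch (not part of the original messages): a read at position
16384 is what a positional read (pread-style) looks like to a file system,
and it can arrive on any handle, in any order, while the write handle is
still open. The "premature end of pack file" messages would be consistent
with such a read coming back short or empty, although the thread does not
confirm that. The class below is a hypothetical in-memory model, not tied
to any real FUSE library, of what a read handler needs to return in that
situation: the bytes already written at the requested offset, a short
result only at end of file, and an empty result past it.

```python
class OpenFileBuffer:
    """Hypothetical in-memory model of one file that is still open for write.

    All names here are illustrative; a real FUSE binding has its own
    callback signatures.
    """

    def __init__(self) -> None:
        self._data = bytearray()

    def write(self, offset: int, chunk: bytes) -> int:
        """Store chunk at offset, zero-filling any gap before it."""
        end = offset + len(chunk)
        if end > len(self._data):
            self._data.extend(bytes(end - len(self._data)))
        self._data[offset:end] = chunk
        return len(chunk)

    def read(self, offset: int, size: int) -> bytes:
        """Return up to size bytes starting at offset.

        The result is shorter than size only when the request runs past
        the bytes written so far, and empty when offset is at or beyond
        that point.
        """
        return bytes(self._data[offset:offset + size])
```

With roughly 200 kB already written, a request shaped like the one in the
log, `read(16384, 4096)`, should yield a full 4096 bytes from this model.
Comparing that with what the mount's read handler actually returns for the
same offset and length is one way to narrow down whether the write path or
the read path loses data.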