Re: Collective wisdom about repos on NFS accessed by concurrent clients (== corruption!?)

From: "Kenneth Ölwing" <kenneth@olwing.se>
To: Git List <git@vger.kernel.org>
Subject: Re: Collective wisdom about repos on NFS accessed by concurrent clients (== corruption!?)
Date: Fri, 05 Apr 2013 14:35:40 +0200
Message-ID: <515EC51C.9070206@olwing.se>
In-Reply-To: <515419D0.7030107@olwing.se>

Hi

Basically, I'm at a place where I'm considering giving up on getting this
to work reliably. In general, my setup works really well, except for the
itty-bitty detail that when I put pressure on things I tend to get into
various kinds of trouble with the central repo being corrupted.

Can anyone authoritatively state anything either way?

TIA,

ken1

On 2013-03-28 11:22, Kenneth Ölwing wrote:
> Hi,
>
> I'm hoping to hear some wisdom on the subject so I can decide if I'm 
> chasing a pipe dream or if it should be expected to work and I just 
> need to work out the kinks.
>
> Finding things like this makes it sound possible:
>   http://permalink.gmane.org/gmane.comp.version-control.git/122670
> but then again, in threads like this:
>   http://kerneltrap.org/mailarchive/git/2010/11/14/44799
> opinions are that it's just not reliable to trust.
>
> My experience so far is that I eventually get repo corruption when I
> stress it with concurrent read/write access from multiple hosts
> (beyond any sort of likely levels, but still). Maybe I'm doing
> something wrong, missing a configuration setting somewhere, or putting
> unfair stress on the system; maybe there's a bona fide bug; or, given
> the inherent difficulty in achieving perfect coherency between machines
> on what's visible on the mount, it's just impossible (?) to truly get
> it working in all situations.
>
> My eventual use case is to set up a bunch of (gitolite) hosts that are
> all effectively identical and all see the same storage, so that clients
> can be directed to any of them to get served. For the purpose of
> this query I assume it can be boiled down to be the same as n users
> working on their own workstations and accessing a central repo on an
> NFS share they all have mounted, using regular file paths. Correct?
>
> I have a number of load-generating test scripts (that from humble 
> beginnings have grown to beasts), but basically, they fork a number of 
> code pieces that proceed to hammer the repo with concurrent reading 
> and writing. If necessary, the scripts can be started on different 
> hosts, too. It's all about the central repo so clients will retry in 
> various modes whenever they get an error, up to re-cloning and 
> starting over. All in all, they can yield quite a load.
>
> On a local filesystem everything seems to hold up fine, which is
> expected. In the scenario with concurrency on top of shared NFS
> storage, however, the scripts eventually fail with various problems
> (when the timing finally finds a hole, I guess): it's possible to clone
> but checkouts fail, errors about refs pointing wrong, and errors where
> the original repo is actually corrupted and can't be cloned from.
> Depending on test strategy, I've also had everything going fine (ops
> look fine in my logs), but afterwards the repo has lost a branch or two.
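
For readers who want to try something similar, the kind of test described
above could be sketched roughly as follows. This is not the original
scripts (those were not posted); it is a minimal illustration under stated
assumptions. The names git(), worker() and run_stress(), the "master"
branch name and the retry count are all made up for the example. Each
worker clones a central bare repo, commits and pushes in a loop, and
rebases and retries when it loses a race. Afterwards the central repo is
checked with "git fsck --full" and the commits on the branch are counted.

```python
import subprocess
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def git(*args, cwd=None):
    """Run one git command as a subprocess; failures are reported via returncode."""
    return subprocess.run(["git", *args], cwd=cwd, capture_output=True, text=True)


def worker(central, workdir, worker_id, pushes, max_retries=20):
    """Clone the central repo, then commit and push repeatedly. Returns pushes that succeeded."""
    clone = Path(workdir) / f"clone-{worker_id}"
    if git("clone", str(central), str(clone)).returncode != 0:
        return 0
    git("config", "user.name", f"worker-{worker_id}", cwd=clone)
    git("config", "user.email", f"worker-{worker_id}@example.invalid", cwd=clone)
    done = 0
    for n in range(pushes):
        # Every worker writes its own files, so rebases never hit content conflicts.
        (clone / f"w{worker_id}-{n}.txt").write_text(f"{worker_id} {n}\n")
        git("add", "-A", cwd=clone)
        git("commit", "-m", f"worker {worker_id} change {n}", cwd=clone)
        for _ in range(max_retries):
            if git("push", "origin", "HEAD:refs/heads/master", cwd=clone).returncode == 0:
                done += 1
                break
            # Lost the race against another writer: rebase onto the new tip and retry.
            git("pull", "--rebase", "origin", "master", cwd=clone)
    return done


def run_stress(base_dir=None, workers=4, pushes=5):
    """Hammer a fresh central repo under base_dir (e.g. an NFS mount), then verify it."""
    with tempfile.TemporaryDirectory(dir=base_dir) as tmp:
        tmp = Path(tmp)
        central = tmp / "central.git"
        git("init", "--bare", str(central))
        # Pin the branch name so the sketch does not depend on init.defaultBranch.
        git("symbolic-ref", "HEAD", "refs/heads/master", cwd=central)

        # Seed the central repo with one commit so every clone has a common base.
        seed = tmp / "seed"
        git("init", str(seed))
        (seed / "README").write_text("seed\n")
        git("add", "-A", cwd=seed)
        git("-c", "user.name=seed", "-c", "user.email=seed@example.invalid",
            "commit", "-m", "seed", cwd=seed)
        git("push", str(central), "HEAD:refs/heads/master", cwd=seed)

        # Threads only drive the load; every git call is its own process.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            pushed = sum(pool.map(lambda i: worker(central, tmp, i, pushes),
                                  range(workers)))

        fsck = git("fsck", "--full", cwd=central)
        count = git("rev-list", "--count", "refs/heads/master", cwd=central)
        return {
            "pushed": pushed,
            "commits": int(count.stdout.strip() or 0),
            "fsck_ok": fsck.returncode == 0,
        }


if __name__ == "__main__":
    print(run_stress())
```

With base_dir left as None this runs against the local temp directory,
which is the case that holds up fine. To exercise the shared-storage case,
point base_dir at a directory on the NFS mount and start the script from
several hosts. A failing fsck, or fewer commits on the branch than
successful pushes plus the seed commit, would correspond to the corruption
and lost updates described above.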
>
> I've experimented with various NFS settings (e.g. turning off the 
> attribute cache), but haven't reached a conclusion. I do suspect that 
> it just is a fact of life with a remote filesystem to have coherency 
> problems with high concurrency like this but I'd be happily proven 
> wrong - I'm not an expert in either NFS or git.
>
> So, any opinions either way would be valuable, e.g. 'it should work'
> or 'it can never work 100%' is fine, as well as any suggestions. 
> Obviously, the concurrency needed to make it probable to hit this 
> seems so unlikely that maybe I just shouldn't worry...
>
> I'd be happy to share scripts and whatever if someone would like to 
> try it out themselves.
>
> Thanks for your time,
>
> ken1

Thread overview: 7+ messages
2013-03-28 10:22 Collective wisdom about repos on NFS accessed by concurrent clients (== corruption!?) Kenneth Ölwing
2013-04-05 12:35 ` Kenneth Ölwing [this message]
2013-04-05 13:42   ` Thomas Rast
2013-04-05 14:45     ` Kenneth Ölwing
2013-04-06  8:11       ` Thomas Rast
2013-04-06 11:49         ` Jason Pyeron
2013-04-07 18:56           ` Kenneth Ölwing