Re: Collective wisdom about repos on NFS accessed by concurrent clients (== corruption!?)

From: "Kenneth Ölwing" <kenneth@olwing.se>
To: Git List <git@vger.kernel.org>
Subject: Re: Collective wisdom about repos on NFS accessed by concurrent clients (== corruption!?)
Date: Fri, 05 Apr 2013 14:35:40 +0200
Message-ID: <515EC51C.9070206@olwing.se>
In-Reply-To: <515419D0.7030107@olwing.se>

Hi

Basically, I'm at the point where I'm considering giving up on getting
this to work reliably. In general, my setup works fine, except for the
itty-bitty detail that when I put pressure on things I tend to get into
various kinds of trouble with the central repo being corrupted.

Can anyone authoritatively state anything either way?

TIA,

ken1

On 2013-03-28 11:22, Kenneth Ölwing wrote:
> Hi,
>
> I'm hoping to hear some wisdom on the subject so I can decide if I'm 
> chasing a pipe dream or if it should be expected to work and I just 
> need to work out the kinks.
>
> Finding things like this makes it sound possible:
>   http://permalink.gmane.org/gmane.comp.version-control.git/122670
> but then again, in threads like this:
>   http://kerneltrap.org/mailarchive/git/2010/11/14/44799
> opinions are that it's just not reliable to trust.
>
> My experience so far is that I eventually get repo corruption when I
> stress it with concurrent read/write access from multiple hosts
> (beyond any sort of likely levels, but still). Maybe I'm doing
> something wrong, missing a configuration setting somewhere, or putting
> unfair stress on the system; maybe there's a bona fide bug; or, given
> the inherent difficulty in achieving perfect coherency between
> machines on what's visible on the mount, maybe it's just impossible (?)
> to truly get it working under all situations.
>
> My eventual use case is to set up a bunch of (gitolite) hosts that are
> all effectively identical and all see the same storage, so that
> clients can be directed to any of them to get served. For the purpose
> of this query I assume it can be boiled down to be the same as n users
> working on their own workstations and accessing a central repo on an
> NFS share they all have mounted, using regular file paths. Correct?
>
> I have a number of load-generating test scripts (that from humble 
> beginnings have grown to beasts), but basically, they fork a number of 
> code pieces that proceed to hammer the repo with concurrent reading 
> and writing. If necessary, the scripts can be started on different 
> hosts, too. It's all about the central repo so clients will retry in 
> various modes whenever they get an error, up to re-cloning and 
> starting over. All in all, they can yield quite a load.
>
> On a local filesystem everything seems to hold up fine, which is
> expected. In the scenario with concurrency on top of shared NFS
> storage, however, the scripts eventually fail with various problems
> (when the timing finally finds a hole, I guess): clones that succeed
> but whose checkouts fail, errors about refs pointing wrong, and errors
> where the original repo is actually corrupted and can't be cloned from.
> Depending on test strategy, I've also had everything going fine (ops
> look fine in my logs), but afterwards the repo has lost a branch or two.
>
> I've experimented with various NFS settings (e.g. turning off the
> attribute cache), but haven't reached a conclusion. I do suspect that
> coherency problems under high concurrency like this are just a fact of
> life with a remote filesystem, but I'd happily be proven wrong - I'm
> not an expert in either NFS or git.
>
> So, any opinions either way would be valuable, e.g. 'it should work'
> or 'it can never work 100%' is fine, as well as any suggestions.
> Obviously, the concurrency needed to make it probable to hit this
> seems so unlikely that maybe I just shouldn't worry...
>
> I'd be happy to share scripts and whatever if someone would like to 
> try it out themselves.
>
> Thanks for your time,
>
> ken1
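
For readers who want to try something similar, here is a minimal sketch
of the kind of load test described in the quoted message. It is not the
author's script (those were not posted). The function names, the branch
naming scheme, and the use of a thread pool that launches separate git
processes (rather than fork) are all assumptions made for illustration.
Each worker clones the central repo, pushes one new branch per round,
and afterwards the central repo is checked with git fsck.

```python
import os
import shutil
import subprocess
import tempfile
from concurrent.futures import ThreadPoolExecutor


def git(*args, cwd=None):
    """Run a git command and return the CompletedProcess (failures do not raise)."""
    return subprocess.run(["git", *args], cwd=cwd, capture_output=True, text=True)


def branch_name(worker_id, round_no):
    # Hypothetical naming scheme: unique per worker and round, so pushes never
    # compete for the same ref.
    return f"stress/w{worker_id}/r{round_no}"


def worker(central, worker_id, rounds):
    """Clone the central repo, then commit and push one new branch per round.

    Returns the number of rounds in which any git step failed.
    """
    failures = 0
    with tempfile.TemporaryDirectory() as tmp:
        clone = os.path.join(tmp, "clone")
        if git("clone", "--quiet", central, clone).returncode != 0:
            return rounds  # the clone itself failed: count every round as failed
        for r in range(rounds):
            name = branch_name(worker_id, r)
            steps = [
                ("checkout", "--quiet", "-b", name),
                ("-c", "user.name=stress", "-c", "user.email=stress@example.invalid",
                 "commit", "--quiet", "--allow-empty", "-m", name),
                ("push", "--quiet", "origin", name),
            ]
            # any() stops at the first failing step in this round.
            if any(git(*s, cwd=clone).returncode != 0 for s in steps):
                failures += 1
    return failures


def stress(central, workers=4, rounds=3):
    """Run the workers concurrently; return (failed_rounds, fsck_exit_code)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        failed = sum(pool.map(lambda w: worker(central, w, rounds), range(workers)))
    # A non-zero fsck exit code means git found problems in the central repo.
    fsck = git("fsck", cwd=central)
    return failed, fsck.returncode
```

To use it, point stress() at a bare repository on the mount under test,
for example stress("/mnt/nfs/central.git") (a made-up path), and run it
from several hosts at once. A clean run on a local filesystem proves
nothing about NFS; the sketch only gives a repeatable way to generate
concurrent pushes and then check the result.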
