git.vger.kernel.org archive mirror
* Multiple clients accessing git over NFS
@ 2010-11-14 21:24 Khawaja Shams
  2010-11-14 23:11 ` Greg Troxel
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Khawaja Shams @ 2010-11-14 21:24 UTC
  To: git

  Is it a recommended practice to share a repository over NFS, where
multiple clients can be pushing changes simultaneously?  In our
production environment, we have a Git repository setup behind
git-http-backend. We would like to place multiple Apache servers
behind a load balancer to maximize availability and performance.
Before we proceed, we wanted to check to see if this practice has a
potential to cause repository corruption. If there are other ways
others have solved this problem, we would be very interested in
learning about those as well. Thank you.


* Re: Multiple clients accessing git over NFS
  2010-11-14 21:24 Multiple clients accessing git over NFS Khawaja Shams
@ 2010-11-14 23:11 ` Greg Troxel
  2010-11-14 23:42   ` Khawaja Shams
                     ` (2 more replies)
  2010-11-15  1:26 ` Sitaram Chamarty
  2010-11-16 13:47 ` Alex
  2 siblings, 3 replies; 10+ messages in thread
From: Greg Troxel @ 2010-11-14 23:11 UTC
  To: Khawaja Shams; +Cc: git



Khawaja Shams <kshams@usc.edu> writes:

> Is it a recommended practice to share a repository over NFS, where
> multiple clients can be pushing changes simultaneously?  In our
> production environment, we have a Git repository setup behind
> git-http-backend. We would like to place multiple Apache servers
> behind a load balancer to maximize availability and performance.
> Before we proceed, we wanted to check to see if this practice has a
> potential to cause repository corruption. If there are other ways
> others have solved this problem, we would be very interested in
> learning about those as well. Thank you.

NFS locking has historically been problematic, and my impression is that
most people avoid it.  Perhaps it's ok on Solaris, but without serious
testing, I'd be worried.

Can you explain what you have set up, and what your performance
situation is, and why you think adding a second or third apache over NFS
will help?  How many users?  How many pushes/day?

One option is to have a multi-core box with tons of RAM running apache;
I've done that for trac (8 core, 16G, RAID5) because trac/python is so
piggy, and buying a $3K box was cheaper than making trac go faster.
That doesn't get you into remote FS locking issues.



* Re: Multiple clients accessing git over NFS
  2010-11-14 23:11 ` Greg Troxel
@ 2010-11-14 23:42   ` Khawaja Shams
  2010-11-15  0:32     ` Jonathan Nieder
  2010-11-15 19:56     ` Jan Hudec
       [not found]   ` <AANLkTinX-XR2TaZPGPeWyekMq3e8wEDkfcmi_o6pTvMK@mail.gmail.com>
  2010-11-15 16:24   ` J. Bruce Fields
  2 siblings, 2 replies; 10+ messages in thread
From: Khawaja Shams @ 2010-11-14 23:42 UTC
  To: git

Hi Greg,
   Thank you for the insightful response. We have multiple automated
clients pushing and pulling changes from git as events occur. We have
not hit any real performance issues just yet. Our main goal is to
improve the availability of the repository in case the box running the
apache server has an outage during a mission critical period. Any
other ideas on how to accomplish this? From your remarks, it sounds
like putting the git repository on NFS, even with a single client, can
be problematic due to the locking issues. Is that what you meant?

   I am still interested in knowing if git can handle multiple
simultaneous pushes on the same repository without encountering
corruption issues. Thank you.


* Re: Multiple clients accessing git over NFS
       [not found]   ` <AANLkTinX-XR2TaZPGPeWyekMq3e8wEDkfcmi_o6pTvMK@mail.gmail.com>
@ 2010-11-14 23:46     ` Greg Troxel
  0 siblings, 0 replies; 10+ messages in thread
From: Greg Troxel @ 2010-11-14 23:46 UTC
  To: Khawaja Shams; +Cc: git



If only one computer is accessing the repository, then failure to lock
may be ok.  But you'll still need atomic rename etc. to work.

I may be overly conservative, but I would not (and do not) allow anyone
to access a repository (cvs, svn, git, whatever) over NFS, ever.

My expectation is that multiple git processes on one machine with a repo
on local disk work fine.  If not, it's a bug.  When you add a remote FS
you have to wonder if the unix filesystem semantics are preserved.

Another approach would be cloned repositories that constantly pull from
the main one or each other, and have people use those.  That will get
you delayed merge conflicts, but they might be useful as RO replicas.
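
The read-only replica idea above could be sketched roughly as follows.
This is only an illustration, not something anyone in this thread has
deployed: the function names and the example URL and path are made up,
and it assumes git is installed and that the replica is a bare mirror
created with `git clone --mirror` and refreshed with `git fetch --prune`.

```python
import subprocess
from pathlib import Path
from typing import List


def replica_commands(source_url: str, replica_dir: str) -> List[List[str]]:
    """Return the git command(s) that create or refresh a read-only mirror.

    Hypothetical helper: the name and layout are assumptions for illustration.
    """
    if not Path(replica_dir).exists():
        # First run: a bare mirror clone copies every ref from the main repo.
        return [["git", "clone", "--mirror", source_url, replica_dir]]
    # Later runs: fetch new objects and drop refs that were deleted upstream.
    return [["git", f"--git-dir={replica_dir}", "fetch", "--prune", "origin"]]


def sync_replica(source_url: str, replica_dir: str) -> None:
    """Run the commands; raises CalledProcessError if git reports a failure."""
    for argv in replica_commands(source_url, replica_dir):
        subprocess.run(argv, check=True)
```

Something like `sync_replica("https://git.example.invalid/repo.git",
"/srv/replica.git")` run from cron or a post-receive hook would keep the
copy current.  Pushes would still have to go to the main repository, so
this helps read availability only.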





* Re: Multiple clients accessing git over NFS
  2010-11-14 23:42   ` Khawaja Shams
@ 2010-11-15  0:32     ` Jonathan Nieder
  2010-11-15 19:56     ` Jan Hudec
  1 sibling, 0 replies; 10+ messages in thread
From: Jonathan Nieder @ 2010-11-15  0:32 UTC
  To: Khawaja Shams; +Cc: git

Khawaja Shams wrote:

>    I am still interested in knowing if git can handle multiple
> simultaneous pushes on the same repository without encountering
> corruption issues.

Yes, concurrent attempts to update a branch are serialized.  (But
please don't ask me to answer about NFS semantics.  See

http://stackoverflow.com/questions/750765/concurrency-in-a-git-repo-on-a-network-shared-folder

for some notes.)  See the note about fast-forwards in the git push
manual for how integrity is preserved.

After reading that, you might wonder: if there are many, many clients
pushing to the same branch, how is starvation avoided?  Good question!
It isn't.  If you have so many clients wanting to push to a single
branch, I would suggest having a single person or a few people
maintaining it, pulling from others.  Life will be better for many
reasons, especially quality control.
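
As a rough illustration of the serialization and starvation points
above, here is a toy in-memory model.  It is not git's code: `Ref`,
`compare_and_swap` and `push_with_retry` are invented names, and a real
client would fetch and merge or rebase before retrying.  The only idea
it tries to capture is that an update is accepted when the branch still
holds the value the pusher last saw, and rejected otherwise.

```python
import threading


class Ref:
    """Toy stand-in for a branch tip on the server (hypothetical)."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._value

    def compare_and_swap(self, expected, new):
        # Accept the update only if nobody moved the branch since the
        # pusher last looked at it; otherwise reject without changing it.
        with self._lock:
            if self._value != expected:
                return False
            self._value = new
            return True


def push_with_retry(ref, make_new_value, max_attempts=5):
    """Retry a rejected update; return the attempt number that succeeded."""
    for attempt in range(1, max_attempts + 1):
        seen = ref.read()
        if ref.compare_and_swap(seen, make_new_value(seen)):
            return attempt
    # Nothing guarantees progress for a given client: this is the
    # starvation case when many writers target one branch.
    raise RuntimeError("gave up: the branch kept moving")
```

With many writers hammering one `Ref`, some callers can hit
`max_attempts` repeatedly, which is one more reason to funnel changes
through a few integrators.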

Hope that helps.
Jonathan


* Re: Multiple clients accessing git over NFS
  2010-11-14 21:24 Multiple clients accessing git over NFS Khawaja Shams
  2010-11-14 23:11 ` Greg Troxel
@ 2010-11-15  1:26 ` Sitaram Chamarty
  2010-11-16 13:47 ` Alex
  2 siblings, 0 replies; 10+ messages in thread
From: Sitaram Chamarty @ 2010-11-15  1:26 UTC
  To: Khawaja Shams; +Cc: git

On Mon, Nov 15, 2010 at 2:54 AM, Khawaja Shams <kshams@usc.edu> wrote:
>   Is it a recommended practice to share a repository over NFS, where
> multiple clients can be pushing changes simultaneously?  In our

http://permalink.gmane.org/gmane.comp.version-control.git/122670

may be useful...

> production environment, we have a Git repository setup behind
> git-http-backend. We would like to place multiple Apache servers
> behind a load balancer to maximize availability and performance.
> Before we proceed, we wanted to check to see if this practice has a
> potential to cause repository corruption. If there are other ways
> others have solved this problem, we would be very interested in
> learning about those as well. Thank you.



-- 
Sitaram


* Re: Multiple clients accessing git over NFS
  2010-11-14 23:11 ` Greg Troxel
  2010-11-14 23:42   ` Khawaja Shams
       [not found]   ` <AANLkTinX-XR2TaZPGPeWyekMq3e8wEDkfcmi_o6pTvMK@mail.gmail.com>
@ 2010-11-15 16:24   ` J. Bruce Fields
  2 siblings, 0 replies; 10+ messages in thread
From: J. Bruce Fields @ 2010-11-15 16:24 UTC
  To: Greg Troxel; +Cc: Khawaja Shams, git

On Sun, Nov 14, 2010 at 06:11:41PM -0500, Greg Troxel wrote:
> 
> Khawaja Shams <kshams@usc.edu> writes:
> 
> > Is it a recommended practice to share a repository over NFS, where
> > multiple clients can be pushing changes simultaneously?  In our
> > production environment, we have a Git repository setup behind
> > git-http-backend. We would like to place multiple Apache servers
> > behind a load balancer to maximize availability and performance.
> > Before we proceed, we wanted to check to see if this practice has a
> > potential to cause repository corruption. If there are other ways
> > others have solved this problem, we would be very interested in
> > learning about those as well. Thank you.
> 
> NFS locking has historically been problematic, and my impression is that
> most people avoid it.  Perhaps it's ok on Solaris, but without serious
> testing, I'd be worried.

Does git actually do file locking when people push to a bare repo?

If all it needs is for rename and/or O_EXCL to be atomic--that should be
fine over NFS.
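
To make the two primitives concrete, here is a minimal sketch of an
update that relies only on exclusive create and rename.  The function
name and the `.lock` suffix are illustrative assumptions, not a
statement of what git does internally, and whether a given NFS mount
really honours both guarantees is the open question in this thread.

```python
import os


def write_ref_atomically(ref_path: str, new_value: str) -> None:
    """Update a small file using an O_EXCL lock file plus a rename.

    Hypothetical helper for illustration only.
    """
    lock_path = ref_path + ".lock"
    # O_EXCL makes creation fail with FileExistsError if another writer
    # already holds the lock, so at most one writer gets past this line.
    fd = os.open(lock_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(new_value + "\n")
            f.flush()
            os.fsync(f.fileno())
        # os.replace is rename(2) on POSIX: readers see either the old
        # file or the new one, never a partially written file.
        os.replace(lock_path, ref_path)
    except BaseException:
        # We own the lock here, so it is safe to clean up after a failure.
        if os.path.exists(lock_path):
            os.unlink(lock_path)
        raise
```

A second writer that finds the lock file already present fails
immediately instead of waiting, so no fcntl-style byte-range locking is
involved at any point.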

--b.

> 
> Can you explain what you have set up, and what your performance
> situation is, and why you think adding a second or third apache over NFS
> will help?  How many users?  How many pushes/day?
> 
> One option is to have a multi-core box with tons of RAM running apache;
> I've done that for trac (8 core, 16G, RAID5) because trac/python is so
> piggy, and buying a $3K box was cheaper than making trac go faster.
> That doesn't get you into remote FS locking issues.


* Re: Multiple clients accessing git over NFS
  2010-11-14 23:42   ` Khawaja Shams
  2010-11-15  0:32     ` Jonathan Nieder
@ 2010-11-15 19:56     ` Jan Hudec
  2010-11-15 20:44       ` Drew Northup
  1 sibling, 1 reply; 10+ messages in thread
From: Jan Hudec @ 2010-11-15 19:56 UTC
  To: Khawaja Shams; +Cc: git

On Sun, Nov 14, 2010 at 15:42:29 -0800, Khawaja Shams wrote:
> Hi Greg,
>    Thank you for the insightful response. We have multiple automated
> clients pushing and pulling changes from git as events occur. We have
> not hit any real performance issues just yet. Our main goal is to
> improve the availability of the repository in case the box running the
> apache server has an outage during a mission critical period.

If you are out for availability, NFS isn't an answer, because the NFS server
remains a single point of failure. There are distributed filesystems
(Gluster, Lustre etc.) that can provide redundancy of storage nodes too or
you could have shared storage array with appropriate filesystem (GlobalFS,
OCFS2, etc.), but that requires special hardware. These will probably give
you better performance too -- git network protocol is optimized to send
minimal data, but that often means a lot more needs to be read from the disk.

I don't have personal experience with them though, so I can't give you more
specific recommendation.

-- 
						 Jan 'Bulb' Hudec <bulb@ucw.cz>


* Re: Multiple clients accessing git over NFS
  2010-11-15 19:56     ` Jan Hudec
@ 2010-11-15 20:44       ` Drew Northup
  0 siblings, 0 replies; 10+ messages in thread
From: Drew Northup @ 2010-11-15 20:44 UTC
  To: Khawaja Shams; +Cc: git, Jan Hudec


On Mon, 2010-11-15 at 20:56 +0100, Jan Hudec wrote:
> On Sun, Nov 14, 2010 at 15:42:29 -0800, Khawaja Shams wrote:
> > Hi Greg,
> >    Thank you for the insightful response. We have multiple automated
> > clients pushing and pulling changes from git as events occur. We have
> > not hit any real performance issues just yet. Our main goal is to
> > improve the availability of the repository in case the box running the
> > apache server has an outage during a mission critical period.
> 
> If you are out for availability, NFS isn't an answer, because the NFS server
> remains a single point of failure. There are distributed filesystems
> (Gluster, Lustre etc.) that can provide redundancy of storage nodes too or
> you could have shared storage array with appropriate filesystem (GlobalFS,
> OCFS2, etc.), but that requires special hardware. These will probably give
> you better performance too -- git network protocol is optimized to send
> minimal data, but that often means a lot more needs to be read from the disk.
> 
> I don't have personal experience with them though, so I can't give you more
> specific recommendation.

Khawaja,
I haven't tried setting a server up with it yet, but perhaps DRBD
mirrored devices may be of use? At that point then you have a way of
making all of your HTTPd instances "see" the same filesystem (and will
have notification options for when they do not). It probably isn't
perfect, but may be worth looking into if your SAN cannot provide
downtime-less NFS. As an added benefit, there is no longer a requirement
that all of your front-ends be co-located (physically or logically).

-- 
-Drew Northup N1XIM
   AKA RvnPhnx on OPN
________________________________________________
"As opposed to vegetable or mineral error?"
-John Pescatore, SANS NewsBites Vol. 12 Num. 59


* Re: Multiple clients accessing git over NFS
  2010-11-14 21:24 Multiple clients accessing git over NFS Khawaja Shams
  2010-11-14 23:11 ` Greg Troxel
  2010-11-15  1:26 ` Sitaram Chamarty
@ 2010-11-16 13:47 ` Alex
  2 siblings, 0 replies; 10+ messages in thread
From: Alex @ 2010-11-16 13:47 UTC
  To: git


Khawaja Shams <kshams <at> usc.edu> writes:

> 
>   Is it a recommended practice to share a repository over NFS, where
> multiple clients can be pushing changes simultaneously?  In our
> production environment, we have a Git repository setup behind
> git-http-backend. We would like to place multiple Apache servers
> behind a load balancer to maximize availability and performance.
> Before we proceed, we wanted to check to see if this practice has a
> potential to cause repository corruption. If there are other ways
> others have solved this problem, we would be very interested in
> learning about those as well. Thank you.
> 


Others have commented on the git aspects of this, but FYI there is a handy
program here: http://www.unixcoding.org/NFSCoding#NFS_Cache_Tester
that tests aspects of your NFS implementation. (Sadly the one we have
at work is crap, or at least it was last time I ran the program).

Alex
 
