* Multiple clients accessing git over NFS
From: Khawaja Shams @ 2010-11-14 21:24 UTC
To: git

Is it a recommended practice to share a repository over NFS, where
multiple clients can be pushing changes simultaneously? In our
production environment, we have a Git repository setup behind
git-http-backend. We would like to place multiple Apache servers
behind a load balancer to maximize availability and performance.
Before we proceed, we wanted to check to see if this practice has a
potential to cause repository corruption. If there are other ways
others have solved this problem, we would be very interested in
learning about those as well. Thank you.
* Re: Multiple clients accessing git over NFS
From: Greg Troxel @ 2010-11-14 23:11 UTC
To: Khawaja Shams; +Cc: git

Khawaja Shams <kshams@usc.edu> writes:

> Is it a recommended practice to share a repository over NFS, where
> multiple clients can be pushing changes simultaneously? In our
> production environment, we have a Git repository setup behind
> git-http-backend. We would like to place multiple Apache servers
> behind a load balancer to maximize availability and performance.
> Before we proceed, we wanted to check to see if this practice has a
> potential to cause repository corruption. If there are other ways
> others have solved this problem, we would be very interested in
> learning about those as well. Thank you.

NFS locking has historically been problematic, and my impression is that
most people avoid it. Perhaps it's ok on Solaris, but without serious
testing, I'd be worried.

Can you explain what you have set up, and what your performance
situation is, and why you think adding a second or third apache over NFS
will help? How many users? How many pushes/day?

One option is to have a multi-core box with tons of RAM running apache;
I've done that for trac (8 core, 16G, RAID5) because trac/python is so
piggy, and buying a $3K box was cheaper than making trac go faster.
That doesn't get you into remote FS locking issues.
* Re: Multiple clients accessing git over NFS
From: Khawaja Shams @ 2010-11-14 23:42 UTC
To: git

Hi Greg,
Thank you for the insightful response. We have multiple automated
clients pushing and pulling changes from git as events occur. We have
not hit any real performance issues just yet. Our main goal is to
improve the availability of the repository in case the box running the
apache server has an outage during a mission critical period. Any
other ideas on how to accomplish this?

From your remarks, it sounds like putting the git repository on NFS,
even with a single client, can be problematic due to the locking
issues. Is that what you meant?

I am still interested in knowing if git can handle multiple
simultaneous pushes on the same repository without encountering
corruption issues. Thank you.
* Re: Multiple clients accessing git over NFS
From: Jonathan Nieder @ 2010-11-15 0:32 UTC
To: Khawaja Shams; +Cc: git

Khawaja Shams wrote:

> I am still interested in knowing if git can handle multiple
> simultaneous pushes on the same repository without encountering
> corruption issues.

Yes, concurrent attempts to update a branch are serialized. (But
please don't ask me to answer about NFS semantics. See
http://stackoverflow.com/questions/750765/concurrency-in-a-git-repo-on-a-network-shared-folder
for some notes.)

See the note about fast-forwards in the git push manual for how
integrity is preserved.

After reading that, you might wonder: if there are many, many clients
pushing to the same branch, how is starvation avoided? Good question!
It isn't. If you have so many clients wanting to push to a single
branch, I would suggest having a single person or a few people
maintaining it, pulling from others. Life will be better for many
reasons, especially quality control.

Hope that helps.
Jonathan
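For illustration only, the fast-forward rule mentioned above can be modelled in a few lines of Python. This is a toy sketch, not git's implementation: the commit names, the `parents` dictionary, and the helpers `is_ancestor` and `try_push` are all hypothetical, and the sketch assumes that the check and the update of a branch happen as one serialized step, as described in the message.

```python
def is_ancestor(parents, ancestor, commit):
    """Return True if `ancestor` is reachable from `commit` via parent links."""
    stack, seen = [commit], set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, ()))
    return False


def try_push(refs, parents, branch, new_tip):
    """Accept a push only if it fast-forwards the branch (hypothetical helper)."""
    old_tip = refs.get(branch)
    if old_tip is not None and not is_ancestor(parents, old_tip, new_tip):
        return False  # the pusher has not seen the current tip: reject
    refs[branch] = new_tip
    return True


# Toy history: B and C were both created on top of A by different clients;
# M is a merge of B and C made after the second client fetched B.
parents = {"A": [], "B": ["A"], "C": ["A"], "M": ["B", "C"]}
refs = {"master": "A"}

first = try_push(refs, parents, "master", "B")   # wins the race
second = try_push(refs, parents, "master", "C")  # rejected, C does not contain B
retry = try_push(refs, parents, "master", "M")   # accepted after merging
```

The rejected client loses no work; it has to fetch, merge or rebase, and push again. With many clients pushing to one branch, nothing stops the same client from losing that race repeatedly, which is the starvation case raised in the message.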
* Re: Multiple clients accessing git over NFS
From: Jan Hudec @ 2010-11-15 19:56 UTC
To: Khawaja Shams; +Cc: git

On Sun, Nov 14, 2010 at 15:42:29 -0800, Khawaja Shams wrote:
> Hi Greg,
> Thank you for the insightful response. We have multiple automated
> clients pushing and pulling changes from git as events occur. We have
> not hit any real performance issues just yet. Our main goal is to
> improve the availability of the repository in case the box running the
> apache server has an outage during a mission critical period.

If you are out for availability, NFS isn't an answer, because the NFS server
remains a single point of failure. There are distributed filesystems
(Gluster, Lustre etc.) that can provide redundancy of storage nodes too or
you could have shared storage array with appropriate filesystem (GlobalFS,
OCFS2, etc.), but that requires special hardware. These will probably give
you better performance too -- git network protocol is optimized to send
minimal data, but that often means a lot more needs to be read from the disk.

I don't have personal experience with them though, so I can't give you more
specific recommendation.

-- 
Jan 'Bulb' Hudec <bulb@ucw.cz>
* Re: Multiple clients accessing git over NFS
From: Drew Northup @ 2010-11-15 20:44 UTC
To: Khawaja Shams; +Cc: git, Jan Hudec

On Mon, 2010-11-15 at 20:56 +0100, Jan Hudec wrote:
> On Sun, Nov 14, 2010 at 15:42:29 -0800, Khawaja Shams wrote:
> > Hi Greg,
> > Thank you for the insightful response. We have multiple automated
> > clients pushing and pulling changes from git as events occur. We have
> > not hit any real performance issues just yet. Our main goal is to
> > improve the availability of the repository in case the box running the
> > apache server has an outage during a mission critical period.
>
> If you are out for availability, NFS isn't an answer, because the NFS server
> remains a single point of failure. There are distributed filesystems
> (Gluster, Lustre etc.) that can provide redundancy of storage nodes too or
> you could have shared storage array with appropriate filesystem (GlobalFS,
> OCFS2, etc.), but that requires special hardware. These will probably give
> you better performance too -- git network protocol is optimized to send
> minimal data, but that often means a lot more needs to be read from the disk.
>
> I don't have personal experience with them though, so I can't give you more
> specific recommendation.

Khawaja,
I haven't tried setting a server up with it yet, but perhaps DRBD
mirrored devices may be of use? At that point then you have a way of
making all of your HTTPd instances "see" the same filesystem (and will
have notification options for when they do not). It probably isn't
perfect, but may be worth looking into if your SAN cannot provide
downtime-less NFS. As an added benefit, there is no longer a
requirement that all of your front-ends be co-located (physically or
logically).

-- 
-Drew Northup N1XIM
   AKA RvnPhnx on OPN
________________________________________________
"As opposed to vegetable or mineral error?"
-John Pescatore, SANS NewsBites Vol. 12 Num. 59
[parent not found: <AANLkTinX-XR2TaZPGPeWyekMq3e8wEDkfcmi_o6pTvMK@mail.gmail.com>]
* Re: Multiple clients accessing git over NFS
From: Greg Troxel @ 2010-11-14 23:46 UTC
To: Khawaja Shams; +Cc: git

If only one computer is accessing the repository, then failure to lock
may be ok. But you'll still need atomic rename etc. to work. I may be
overly conservative, but I would not (and do not) allow anyone to
access a repository (cvs, svn, git, whatever) over NFS, ever.

My expectation is that multiple git processes on one machine with a
repo on local disk works fine. If not it's a bug. When you add a
remote FS you have to wonder if the unix filesystem semantics are
preserved.

Another approach would be cloned repositories that constantly pull
from the main one or each other, and have people use those. That will
get you delayed merge conflicts, but they might be useful as RO
replicas.
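For illustration, the read-only replica idea could be sketched roughly as below. This is a minimal Python sketch under assumptions, not a tested deployment: the helper names `replica_command` and `sync_replica` are hypothetical, as are any URL and path passed to them. It relies only on `git clone --mirror` and `git remote update --prune`, and leaves scheduling (cron, a loop with a sleep, a post-receive hook on the main repository) to the reader.

```python
import subprocess
from pathlib import Path


def replica_command(source_url, replica_dir):
    """Return the git argv that creates or refreshes a bare mirror."""
    replica = Path(replica_dir)
    if not replica.exists():
        # First run: make a bare mirror of every ref in the main repository.
        return ["git", "clone", "--mirror", source_url, str(replica)]
    # Later runs: fetch new objects and refs, dropping refs deleted upstream.
    return ["git", f"--git-dir={replica}", "remote", "update", "--prune"]


def sync_replica(source_url, replica_dir):
    """Run one synchronisation pass; raises CalledProcessError on failure."""
    subprocess.run(replica_command(source_url, replica_dir), check=True)
```

Each replica keeps its objects on its own local disk, so no filesystem is shared between machines. Pushes would still have to go to the single main repository, and a replica can lag behind it by up to one synchronisation interval.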
* Re: Multiple clients accessing git over NFS
From: J. Bruce Fields @ 2010-11-15 16:24 UTC
To: Greg Troxel; +Cc: Khawaja Shams, git

On Sun, Nov 14, 2010 at 06:11:41PM -0500, Greg Troxel wrote:
>
> Khawaja Shams <kshams@usc.edu> writes:
>
> > Is it a recommended practice to share a repository over NFS, where
> > multiple clients can be pushing changes simultaneously? In our
> > production environment, we have a Git repository setup behind
> > git-http-backend. We would like to place multiple Apache servers
> > behind a load balancer to maximize availability and performance.
> > Before we proceed, we wanted to check to see if this practice has a
> > potential to cause repository corruption. If there are other ways
> > others have solved this problem, we would be very interested in
> > learning about those as well. Thank you.
>
> NFS locking has historically been problematic, and my impression is that
> most people avoid it. Perhaps it's ok on Solaris, but without serious
> testing, I'd be worried.

Does git actually do file locking when people push to a bare repo?

If all it needs is for rename and/or O_EXCL to be atomic--that should be
fine over NFS.

--b.

>
> Can you explain what you have set up, and what your performance
> situation is, and why you think adding a second or third apache over NFS
> will help? How many users? How many pushes/day?
>
> One option is to have a multi-core box with tons of RAM running apache;
> I've done that for trac (8 core, 16G, RAID5) because trac/python is so
> piggy, and buying a $3K box was cheaper than making trac go faster.
> That doesn't get you into remote FS locking issues.
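For illustration, the pattern being asked about (an exclusive create followed by an atomic rename, with no fcntl or flock style locking) can be sketched as below. This is a minimal Python sketch and not git's actual code: `read_ref`, `update_ref`, `RefLockedError` and the `.lock` suffix are names chosen for the example. Whether the exclusive create and the rename really are atomic depends on the NFS version, server and client in use, so it would need testing on the target setup.

```python
import os


class RefLockedError(Exception):
    """Raised when another writer already holds the lock file."""


def read_ref(ref_path):
    """Return the stripped contents of the ref file, or None if it is missing."""
    try:
        with open(ref_path) as ref_file:
            return ref_file.read().strip()
    except FileNotFoundError:
        return None


def update_ref(ref_path, expected_old, new_value):
    """Set the ref to new_value only if it still holds expected_old.

    Returns True on success and False if the ref changed in the meantime.
    """
    lock_path = ref_path + ".lock"
    try:
        # O_EXCL makes creation fail if the lock file already exists.
        fd = os.open(lock_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    except FileExistsError:
        raise RefLockedError(lock_path) from None
    committed = False
    try:
        with os.fdopen(fd, "w") as lock_file:
            if read_ref(ref_path) != expected_old:
                return False
            lock_file.write(new_value + "\n")
            lock_file.flush()
            os.fsync(lock_file.fileno())
        # Rename the fully written lock file over the ref in one step.
        os.replace(lock_path, ref_path)
        committed = True
        return True
    finally:
        if not committed:
            os.unlink(lock_path)  # release the lock without changing the ref
```

A second writer that arrives while the lock file exists gets `RefLockedError` and can retry; a writer whose `expected_old` is stale gets `False`. Readers never see a half-written ref, because the new contents only become visible through the rename.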
* Re: Multiple clients accessing git over NFS
From: Sitaram Chamarty @ 2010-11-15 1:26 UTC
To: Khawaja Shams; +Cc: git

On Mon, Nov 15, 2010 at 2:54 AM, Khawaja Shams <kshams@usc.edu> wrote:
> Is it a recommended practice to share a repository over NFS, where
> multiple clients can be pushing changes simultaneously? In our

http://permalink.gmane.org/gmane.comp.version-control.git/122670 may
be useful...

> production environment, we have a Git repository setup behind
> git-http-backend. We would like to place multiple Apache servers
> behind a load balancer to maximize availability and performance.
> Before we proceed, we wanted to check to see if this practice has a
> potential to cause repository corruption. If there are other ways
> others have solved this problem, we would be very interested in
> learning about those as well. Thank you.
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

-- 
Sitaram
* Re: Multiple clients accessing git over NFS
From: Alex @ 2010-11-16 13:47 UTC
To: git

Khawaja Shams <kshams <at> usc.edu> writes:

>
> Is it a recommended practice to share a repository over NFS, where
> multiple clients can be pushing changes simultaneously? In our
> production environment, we have a Git repository setup behind
> git-http-backend. We would like to place multiple Apache servers
> behind a load balancer to maximize availability and performance.
> Before we proceed, we wanted to check to see if this practice has a
> potential to cause repository corruption. If there are other ways
> others have solved this problem, we would be very interested in
> learning about those as well. Thank you.
>

Others have commented on the git aspects of this, but FYI there is a
handy program here:

http://www.unixcoding.org/NFSCoding#NFS_Cache_Tester

that tests aspects of your NFS implementation. (Sadly the one we have
at work is crap, or at least it was last time I ran the program).

Alex
end of thread (newest message ~2010-11-16 13:47 UTC)

Thread overview: 10+ messages
2010-11-14 21:24 Multiple clients accessing git over NFS Khawaja Shams
2010-11-14 23:11 ` Greg Troxel
2010-11-14 23:42   ` Khawaja Shams
2010-11-15  0:32     ` Jonathan Nieder
2010-11-15 19:56     ` Jan Hudec
2010-11-15 20:44       ` Drew Northup
     [not found]   ` <AANLkTinX-XR2TaZPGPeWyekMq3e8wEDkfcmi_o6pTvMK@mail.gmail.com>
2010-11-14 23:46     ` Greg Troxel
2010-11-15 16:24   ` J. Bruce Fields
2010-11-15  1:26 ` Sitaram Chamarty
2010-11-16 13:47 ` Alex