git.vger.kernel.org archive mirror
* Use alternate GIT servers to share traffic
@ 2009-03-25 20:45 Thomas Koch
  2009-03-25 22:57 ` Johannes Schindelin
  2009-03-26 13:40 ` Samuel Lucas Vaz de Mello
  0 siblings, 2 replies; 4+ messages in thread
From: Thomas Koch @ 2009-03-25 20:45 UTC (permalink / raw)
  To: git

Hi,

we host a public GIT repository on our high-availability company
cluster. Cloning the repo causes a traffic volume of 326 MB. We'd like to
avoid that much traffic while still leaving the GIT repo where it is.

I could imagine the following conversation between the GIT client and
server:

Client: Wanna clone!
Server: You're welcome. Please note that while I serve the most current
state, you can get objects much faster from my colleague, server
CHEAPHOST.
Client: Thank you. Will take all the objects I can get from CHEAPHOST
and come back if I should need anything else!

The end user should not need to specify anything beyond the regular
git clone EXPENSIVEHOST line.

Your thoughts?

Best regards,
-- 
Thomas Koch, http://www.koch.ro
YMC AG, http://www.ymc.ch


* Re: Use alternate GIT servers to share traffic
  2009-03-25 20:45 Use alternate GIT servers to share traffic Thomas Koch
@ 2009-03-25 22:57 ` Johannes Schindelin
  2009-03-26  4:30   ` Andrew Wang
  2009-03-26 13:40 ` Samuel Lucas Vaz de Mello
  1 sibling, 1 reply; 4+ messages in thread
From: Johannes Schindelin @ 2009-03-25 22:57 UTC (permalink / raw)
  To: Thomas Koch; +Cc: git

Hi,

On Wed, 25 Mar 2009, Thomas Koch wrote:

> we host a public GIT repository on our high-availability company
> cluster. Cloning the repo causes a traffic volume of 326 MB. We'd like to
> avoid that much traffic while still leaving the GIT repo where it is.
> 
> I could imagine the following conversation between the GIT client and
> server:
> 
> Client: Wanna clone!
> Server: You're welcome. Please note that while I serve the most current
> state, you can get objects much faster from my colleague, server
> CHEAPHOST.
> Client: Thank you. Will take all the objects I can get from CHEAPHOST
> and come back if I should need anything else!
> 
> The end user should not need to specify anything beyond the regular
> git clone EXPENSIVEHOST line.
> 
> Your thoughts?

That sounds a lot like the mirror-sync idea.

Ciao,
Dscho


* Re: Use alternate GIT servers to share traffic
  2009-03-25 22:57 ` Johannes Schindelin
@ 2009-03-26  4:30   ` Andrew Wang
  0 siblings, 0 replies; 4+ messages in thread
From: Andrew Wang @ 2009-03-26  4:30 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Thomas Koch, git

On Wed, Mar 25, 2009 at 6:57 PM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> Hi,
>
> On Wed, 25 Mar 2009, Thomas Koch wrote:
>
>> we host a public GIT repository on our high-availability company
>> cluster. Cloning the repo causes a traffic volume of 326 MB. We'd like to
>> avoid that much traffic while still leaving the GIT repo where it is.
>>
>> I could imagine the following conversation between the GIT client and
>> server:
>>
>> Client: Wanna clone!
>> Server: You're welcome. Please note that while I serve the most current
>> state, you can get objects much faster from my colleague, server
>> CHEAPHOST.
>> Client: Thank you. Will take all the objects I can get from CHEAPHOST
>> and come back if I should need anything else!
>>
>> The end user should not need to specify anything beyond the regular
>> git clone EXPENSIVEHOST line.
>>
>> Your thoughts?
>
> That sounds a lot like the mirror-sync idea.
>
> Ciao,
> Dscho

Yeah, that would definitely fall under mirror-sync functionality. I
posted my GSoC proposal for implementing this to the list
(http://marc.info/?l=git&m=123795365411979&w=2); comments and
criticism are welcome.

Andrew


* Re: Use alternate GIT servers to share traffic
  2009-03-25 20:45 Use alternate GIT servers to share traffic Thomas Koch
  2009-03-25 22:57 ` Johannes Schindelin
@ 2009-03-26 13:40 ` Samuel Lucas Vaz de Mello
  1 sibling, 0 replies; 4+ messages in thread
From: Samuel Lucas Vaz de Mello @ 2009-03-26 13:40 UTC (permalink / raw)
  To: Thomas Koch; +Cc: git

Thomas Koch wrote:
> Hi,
> 
> we host a public GIT repository on our high-availability company
> cluster. Cloning the repo causes a traffic volume of 326 MB. We'd like to
> avoid that much traffic while still leaving the GIT repo where it is.
> 
> I could imagine the following conversation between the GIT client and
> server:
> 
> Client: Wanna clone!
> Server: You're welcome. Please note that while I serve the most current
> state, you can get objects much faster from my colleague, server
> CHEAPHOST.
> Client: Thank you. Will take all the objects I can get from CHEAPHOST
> and come back if I should need anything else!
> 
> The end user should not need to specify anything beyond the regular
> git clone EXPENSIVEHOST line.
> 
> Your thoughts?
> 

I have a scenario here that is (nearly) similar to what you want.
We have two development sites and we let users choose the server that is closer to them.
As all changes to these repositories are made using push, we use post-receive hooks to synchronize them.
Users can push to either server and the changes will get replicated.

On EXPENSIVEHOST, you add CHEAPHOST as a remote and put a 'git push --mirror cheaphost' in the post-receive hook.

On CHEAPHOST, you add EXPENSIVEHOST as a remote and change the git config so that fetched references go to refs/heads/* instead of refs/remotes/expensivehost/*.  In the post-receive hook you add a 'git push --all expensivehost'.

Also, you need to ensure that all users can authenticate on both servers (or, in my case, I made the hook use sudo to push the updates as a special user that authenticates using ssh keys).
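
The setup above can be tried out locally before touching real servers. The following is only a sketch under some assumptions that are not in the description: two bare repositories in a temporary directory stand in for EXPENSIVEHOST and CHEAPHOST, the branch is called master, and the config change on CHEAPHOST is taken to mean setting remote.expensivehost.fetch. Authentication (sudo, ssh keys) is left out entirely.

```shell
#!/bin/sh
# Sketch only: local bare repositories stand in for the two servers.
set -e
work=$(mktemp -d)
expensive="$work/expensive.git"
cheap="$work/cheap.git"
git init --quiet --bare "$expensive"
git init --quiet --bare "$cheap"
mkdir -p "$expensive/hooks" "$cheap/hooks"

# Each server knows the other one as a remote.
git --git-dir="$expensive" remote add cheaphost "$cheap"
git --git-dir="$cheap" remote add expensivehost "$expensive"

# Assumed reading of the config change: on CHEAPHOST, fetched branches go
# to refs/heads/* instead of refs/remotes/expensivehost/*.
git --git-dir="$cheap" config remote.expensivehost.fetch '+refs/heads/*:refs/heads/*'

# post-receive hooks: each side forwards what it has just received.
printf '#!/bin/sh\ngit push --quiet --mirror cheaphost\n' \
    > "$expensive/hooks/post-receive"
printf '#!/bin/sh\ngit push --quiet --all expensivehost\n' \
    > "$cheap/hooks/post-receive"
chmod +x "$expensive/hooks/post-receive" "$cheap/hooks/post-receive"

# A developer pushes one commit to CHEAPHOST only.
git init --quiet "$work/dev"
git -C "$work/dev" -c user.name=dev -c user.email=dev@example.invalid \
    commit --quiet --allow-empty -m 'first commit'
git -C "$work/dev" push --quiet "$cheap" HEAD:refs/heads/master

# Both servers should now report the same commit for master.
git --git-dir="$expensive" rev-parse refs/heads/master
git --git-dir="$cheap" rev-parse refs/heads/master
```

The hooks do not bounce pushes back and forth forever in this sketch: once both sides hold the same refs, the forwarded push has nothing to update, so the hook on the other side is not triggered again.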

The drawbacks are:

1) Users must manually choose the closest server. (maybe some sort of round-robin DNS would do it automatically?)

2) Branch and tag deletion must be done in the EXPENSIVEHOST.

3) EXPENSIVEHOST stores the remote refs from CHEAPHOST in remotes/cheaphost/* and they are pushed back to CHEAPHOST by push --mirror. These references are not used at all, but they can cause some noise in the log message during pushes.

4) If we have two users committing to the same branch at exactly the same time on the two servers, I'm not sure what will happen :-). As a precaution, I added a cron job on CHEAPHOST that does a 'git remote update' in the repo. So, if the servers become inconsistent, it will perform a forced update from EXPENSIVEHOST to CHEAPHOST.

I have had this scenario running for just a few days, so there may be some additional corner cases.
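
To illustrate the precaution in 4), here is a small sketch of what the cron job's 'git remote update' does when the two servers disagree. It makes the same assumptions as any local test would: bare repositories in a temporary directory instead of real hosts, a branch called master, and a fetch refspec of '+refs/heads/*:refs/heads/*' on CHEAPHOST. No hooks are installed here, so the two sides can be made to diverge on purpose.

```shell
#!/bin/sh
# Sketch only: shows the forced update performed by 'git remote update'.
set -e
work=$(mktemp -d)
expensive="$work/expensive.git"
cheap="$work/cheap.git"
git init --quiet --bare "$expensive"
git init --quiet --bare "$cheap"
git --git-dir="$cheap" remote add expensivehost "$expensive"
git --git-dir="$cheap" config remote.expensivehost.fetch '+refs/heads/*:refs/heads/*'

# Make the two servers disagree about master: one unrelated commit each.
for side in expensive cheap; do
    git init --quiet "$work/$side-dev"
    git -C "$work/$side-dev" -c user.name=dev -c user.email=dev@example.invalid \
        commit --quiet --allow-empty -m "commit made on $side"
    git -C "$work/$side-dev" push --quiet "$work/$side.git" HEAD:refs/heads/master
done

# What the cron job would run inside the repository on CHEAPHOST.
# The leading '+' in the refspec makes the update forced.
git --git-dir="$cheap" remote update >/dev/null 2>&1

# CHEAPHOST's master now matches EXPENSIVEHOST's master.
git --git-dir="$expensive" rev-parse refs/heads/master
git --git-dir="$cheap" rev-parse refs/heads/master
```

Note that the commit that existed only on CHEAPHOST is no longer reachable from master after the update, which is the price of resolving the inconsistency in favour of EXPENSIVEHOST.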

HTH,

 - Samuel
