git.vger.kernel.org archive mirror
* Corporate firewall braindamage
@ 2008-04-10 21:11 H. Peter Anvin
  2008-04-10 23:14 ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: H. Peter Anvin @ 2008-04-10 21:11 UTC
  To: Git Mailing List; +Cc: ftpadmin

The apparent commonality of corporate firewall braindamage, and the 
resulting "need" of people to pull over dumb (http) transport, is an 
ongoing problem on kernel.org.

I have thought some about what can be done to improve the situation, and 
I have come up with the following list of possibilities, pretty much 
listed in order from easiest and least generic to hardest and most generic.

It would be very interesting if people who have familiarity with this 
particular class of braindamaged firewalls could comment on how many 
users would be helped by which ones of these solutions.


1. git protocol via CONNECT http proxy

    Connect to http proxy, and use a CONNECT method to establish a link
    to the git server, using the normal git protocol.

    Minor change to TCP connection setup, but no other changes needed.
    No changes on the server side.


2. git protocol over SSL via CONNECT http proxy

    Same as #1, but encapsulate the data stream in an SSL connection.
    If the git server is run on port 443, then the fact that the data
    on the SSL connection isn't actually HTTP should be invisible to the
    proxy, and thus this *should* work anywhere which allows https://
    traffic.

    Requires the git server to speak SSL.


3. git protocol encapsulated in HTTP POST transaction

    git protocol is already fundamentally a RPC protocol, where the
    client sends a query and the server responds.  Furthermore, it
    tries to minimize the number of round trips (RPC calls), which is
    of course desirable.

    Each such RPC transaction could be formulated as an HTTP POST
    transaction.

    This requires modifications to both the client and the server;
    furthermore, the server can no longer rely on the invariant "one TCP
    connection == one session"; a proxy might break a single session
    into arbitrarily many TCP connections.
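
As a rough illustration of option 1, the client-side change amounts to
sending one CONNECT request to the proxy and checking its reply before
starting the normal git protocol. The sketch below is a minimal example
under stated assumptions, not git's implementation: the function names
are hypothetical, and port 9418 is simply the usual git daemon port.

```python
import socket


def build_connect_request(host: str, port: int) -> bytes:
    """Return the bytes of an HTTP/1.1 CONNECT request for host:port."""
    target = f"{host}:{port}"
    return (
        f"CONNECT {target} HTTP/1.1\r\n"
        f"Host: {target}\r\n"
        "\r\n"
    ).encode("ascii")


def proxy_accepted(response_head: bytes) -> bool:
    """True if the proxy's status line carries a three-digit 2xx code."""
    status_line = response_head.split(b"\r\n", 1)[0]
    parts = status_line.split(None, 2)
    if len(parts) < 2:
        return False
    code = parts[1]
    return len(code) == 3 and code.isdigit() and code.startswith(b"2")


def open_tunnel(proxy_host: str, proxy_port: int,
                host: str, port: int = 9418,
                timeout: float = 10.0) -> socket.socket:
    """Connect to the proxy and ask it for a raw tunnel to host:port.

    On success the returned socket can be used as if it were connected
    directly to the git server. The git client speaks first, so no
    server bytes are expected to follow the proxy's response headers.
    """
    sock = socket.create_connection((proxy_host, proxy_port), timeout=timeout)
    try:
        sock.sendall(build_connect_request(host, port))
        head = b""
        while b"\r\n\r\n" not in head:
            chunk = sock.recv(4096)
            if not chunk:
                raise ConnectionError("proxy closed the connection early")
            head += chunk
        if not proxy_accepted(head):
            raise ConnectionError("proxy refused the CONNECT request")
        return sock
    except Exception:
        sock.close()
        raise
```

Whether a given proxy allows CONNECT to a port other than 443 is a
matter of site policy, which this sketch cannot work around.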

Thoughts?

	-hpa

* Re: Corporate firewall braindamage
  2008-04-10 21:11 Corporate firewall braindamage H. Peter Anvin
@ 2008-04-10 23:14 ` Junio C Hamano
  2008-04-10 23:33   ` Shawn O. Pearce
  2008-04-11  8:25   ` david
  0 siblings, 2 replies; 6+ messages in thread
From: Junio C Hamano @ 2008-04-10 23:14 UTC
  To: H. Peter Anvin; +Cc: Git Mailing List, ftpadmin

"H. Peter Anvin" <hpa@zytor.com> writes:

> 1. git protocol via CONNECT http proxy
>
>    Connect to http proxy, and use a CONNECT method to establish a link
>    to the git server, using the normal git protocol.
>
>    Minor change to TCP connection setup, but no other changes needed.
>    No changes on the server side.

Many firewalls will detect that the CONNECT is not going to 443 and block
you, and even if you run git:// daemon on 443, they will detect that you
are not talking SSL initial exchange and shut you off.
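
To see why such a check is cheap for a firewall, compare the first bytes
of the two protocols. The sketch below is illustrative only: the helper
names are made up, the host and path are placeholders, and the check
covers only the SSLv3/TLS record format (an SSLv2-style hello looks
different). It is not a description of any particular firewall product.

```python
def git_daemon_request(path: str, host: str) -> bytes:
    """Build the initial git:// request as a single pkt-line.

    The four leading hex digits give the total length of the line,
    including the four digits themselves.
    """
    payload = f"git-upload-pack {path}\0host={host}\0".encode("ascii")
    return f"{len(payload) + 4:04x}".encode("ascii") + payload


def looks_like_tls_client_hello(first_bytes: bytes) -> bool:
    """Check for an SSLv3/TLS handshake record carrying a ClientHello.

    Byte 0 is the record type (0x16 = handshake), byte 1 the major
    protocol version (0x03), and byte 5 the handshake message type
    (0x01 = ClientHello).
    """
    return (
        len(first_bytes) >= 6
        and first_bytes[0] == 0x16
        and first_bytes[1] == 0x03
        and first_bytes[5] == 0x01
    )


request = git_daemon_request("/pub/scm/git/git.git", "git.example.org")
# A git:// request starts with ASCII hex digits, not a TLS record header.
print(looks_like_tls_client_hello(request))
```

A proxy that inspects only these leading bytes can already tell the two
apart, whatever port the daemon listens on.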

> 2. git protocol over SSL via CONNECT http proxy
>
>    Same as #1, but encapsulate the data stream in an SSL connection.
>    If the git server is run on port 443, then the fact that the data
>    on the SSL connection isn't actually HTTP should be invisible to the
>    proxy, and thus this *should* work anywhere which allows https://
>    traffic.
>
>    Requires the git server to speak SSL.

Yes, perhaps putting it behind an independent ssl relay would give you a
solution without any code change.

> 3. git protocol encapsulated in HTTP POST transaction
>
>    git protocol is already fundamentally a RPC protocol, where the
>    client sends a query and the server responds.  Furthermore, it
>    tries to minimize the number of round trips (RPC calls), which is
>    of course desirable.
>
>    Each such RPC transaction could be formulated as an HTTP POST
>    transaction.
>
>    This requires modifications to both the client and the server;
>    furthermore, the server can no longer rely on the invariant "one TCP
>    connection == one session"; a proxy might break a single session
>    into arbitrarily many TCP connections.

It would probably be a one-CS/EE-student-half-a-summer sized project to
create such a server-side support with a specialized client.
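
For a sense of what the client half might involve, here is a minimal
sketch of framing one RPC exchange as an HTTP POST. Everything specific
in it is an assumption made for illustration: the /rpc URL, the
"session" and "seq" query parameters, and the function name are not part
of any existing git protocol.

```python
import urllib.parse
import urllib.request


def rpc_as_post(base_url: str, session_id: str, seq: int,
                payload: bytes) -> urllib.request.Request:
    """Wrap one client-to-server RPC message in a POST request.

    The session ID and sequence number travel in the query string so
    that the server can tie together requests that arrive over
    different TCP connections.
    """
    query = urllib.parse.urlencode({"session": session_id, "seq": seq})
    return urllib.request.Request(
        f"{base_url}?{query}",
        data=payload,
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )


req = rpc_as_post("http://git.example.org/rpc", "abc123", 0, b"first request")
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen() would send it; the
server's reply body would then carry the response half of the RPC.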

* Re: Corporate firewall braindamage
  2008-04-10 23:14 ` Junio C Hamano
@ 2008-04-10 23:33   ` Shawn O. Pearce
  2008-04-10 23:50     ` H. Peter Anvin
  2008-04-11  8:25   ` david
  1 sibling, 1 reply; 6+ messages in thread
From: Shawn O. Pearce @ 2008-04-10 23:33 UTC
  To: Junio C Hamano; +Cc: H. Peter Anvin, Git Mailing List, ftpadmin

Junio C Hamano <gitster@pobox.com> wrote:
> "H. Peter Anvin" <hpa@zytor.com> writes:
> > 3. git protocol encapsulated in HTTP POST transaction
> >
> >    git protocol is already fundamentally a RPC protocol, where the
> >    client sends a query and the server responds.  Furthermore, it
> >    tries to minimize the number of round trips (RPC calls), which is
> >    of course desirable.
> >
> >    Each such RPC transaction could be formulated as an HTTP POST
> >    transaction.
> >
> >    This requires modifications to both the client and the server;
> >    furthermore, the server can no longer rely on the invariant "one TCP
> >    connection == one session"; a proxy might break a single session
> >    into arbitrarily many TCP connections.
> 
> It would probably be a one-CS/EE-student-half-a-summer sized project to
> create such a server-side support with a specialized client.

Funny you say that.  This was a GSoC 2008 project idea.  We even
received an application from a student for it.

The hard part is either making the server side stateful, so it can
remember what the last RPC call had said it wants/haves, or doing a
stateless protocol where the client uses an exponential expansion
(or some such behavior) of its have list until the server replies
with the pack data.
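
The stateless variant can be simulated in a few lines. This is only a
toy model under loud assumptions: commits are plain strings listed
newest first, the "server" is a function that sees nothing but the
current request, and the starting batch size of 16 is arbitrary. It is
not how git's real negotiation is implemented.

```python
from typing import Callable, Optional, Sequence, Set, Tuple


def stateless_server(known: Set[str], haves: Sequence[str]) -> Optional[str]:
    """Answer one self-contained request.

    Returns the first 'have' the server also knows (at which point it
    could send pack data), or None to ask the client for more.
    """
    for commit in haves:
        if commit in known:
            return commit
    return None


def negotiate(local_history: Sequence[str],
              ask: Callable[[Sequence[str]], Optional[str]],
              first_batch: int = 16) -> Tuple[Optional[str], int]:
    """Resend a doubling prefix of local_history until the server matches.

    Every request repeats all earlier haves, so the server needs no
    memory of previous calls. Returns (common commit or None, rounds).
    """
    batch = first_batch
    rounds = 0
    while True:
        rounds += 1
        common = ask(local_history[:batch])
        if common is not None or batch >= len(local_history):
            return common, rounds
        batch *= 2


history = [f"c{i}" for i in range(100)]          # newest first
server_known = {f"c{i}" for i in range(40, 100)}  # server lacks c0..c39
print(negotiate(history, lambda haves: stateless_server(server_known, haves)))
```

The price of statelessness is that each request grows; doubling keeps
the number of round trips logarithmic in the depth of the common commit.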

-- 
Shawn.

* Re: Corporate firewall braindamage
  2008-04-10 23:33   ` Shawn O. Pearce
@ 2008-04-10 23:50     ` H. Peter Anvin
  2008-04-11  1:03       ` H. Peter Anvin
  0 siblings, 1 reply; 6+ messages in thread
From: H. Peter Anvin @ 2008-04-10 23:50 UTC
  To: Shawn O. Pearce; +Cc: Junio C Hamano, Git Mailing List, ftpadmin

Shawn O. Pearce wrote:
> 
> Funny you say that.  This was a GSoC 2008 project idea.  We even
> received an application from a student for it.
> 
> The hard part is either making the server side stateful, so it can
> remember what the last RPC call had said it wants/haves, or doing a
> stateless protocol where the client uses an exponential expansion
> (or some such behavior) of its have list until the server replies
> with the pack data.
> 

One easy way of doing the former is to have a session reassociator in 
the flow; pretty much a multiplexer which receives the HTTP request, and 
passes it onto a work slave (which can be an ordinary process, in fact, 
can be the ordinary git daemon) based on a session and sequence ID.
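
A minimal sketch of that idea, assuming each HTTP request carries a
session ID and a sequence number starting at 0. The class and function
names are hypothetical, the worker is any callable that keeps its own
state, and returning a list of replies is a simplification: a real
front end would have to answer each HTTP request individually.

```python
from typing import Callable, Dict, List

Worker = Callable[[bytes], bytes]


class Session:
    """Per-session state: the worker plus requests that arrived early."""

    def __init__(self, worker: Worker) -> None:
        self.worker = worker
        self.next_seq = 0
        self.pending: Dict[int, bytes] = {}


class SessionReassociator:
    """Route request bodies to one worker per session, in sequence order."""

    def __init__(self, worker_factory: Callable[[], Worker]) -> None:
        self._worker_factory = worker_factory
        self._sessions: Dict[str, Session] = {}

    def handle(self, session_id: str, seq: int, body: bytes) -> List[bytes]:
        session = self._sessions.get(session_id)
        if session is None:
            session = Session(self._worker_factory())
            self._sessions[session_id] = session
        # Ignore retransmissions of requests that were already delivered.
        if seq >= session.next_seq:
            session.pending.setdefault(seq, body)
        replies = []
        # Deliver every request that is now next in line.
        while session.next_seq in session.pending:
            request = session.pending.pop(session.next_seq)
            replies.append(session.worker(request))
            session.next_seq += 1
        return replies


def make_history_worker() -> Worker:
    """Example stateful worker: replies with everything seen so far."""
    seen: List[bytes] = []

    def worker(body: bytes) -> bytes:
        seen.append(body)
        return b"|".join(seen)

    return worker
```

Because the worker only ever sees an ordered stream of requests for a
single session, it does not need to know that HTTP is involved at all.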

	-hpa

* Re: Corporate firewall braindamage
  2008-04-10 23:50     ` H. Peter Anvin
@ 2008-04-11  1:03       ` H. Peter Anvin
  0 siblings, 0 replies; 6+ messages in thread
From: H. Peter Anvin @ 2008-04-11  1:03 UTC
  To: Shawn O. Pearce; +Cc: Junio C Hamano, Git Mailing List, ftpadmin

H. Peter Anvin wrote:
> Shawn O. Pearce wrote:
>>
>> Funny you say that.  This was a GSoC 2008 project idea.  We even
>> received an application from a student for it.
>>
>> The hard part is either making the server side stateful, so it can
>> remember what the last RPC call had said it wants/haves, or doing a
>> stateless protocol where the client uses an exponential expansion
>> (or some such behavior) of its have list until the server replies
>> with the pack data.
>>
> 
> One easy way of doing the former is to have a session reassociator in 
> the flow; pretty much a multiplexer which receives the HTTP request, and 
> passes it onto a work slave (which can be an ordinary process, in fact, 
> can be the ordinary git daemon) based on a session and sequence ID.
> 

s/multiplexer/demultiplexer/

The best might be to turn the demultiplexer either into an Apache module 
or some scripting language which can run inside Apache (e.g. mod_perl) 
to avoid Apache spawning a CGI program which is only used to talk to the 
git daemon backend.

	-hpa

* Re: Corporate firewall braindamage
  2008-04-10 23:14 ` Junio C Hamano
  2008-04-10 23:33   ` Shawn O. Pearce
@ 2008-04-11  8:25   ` david
  1 sibling, 0 replies; 6+ messages in thread
From: david @ 2008-04-11  8:25 UTC
  To: Junio C Hamano; +Cc: H. Peter Anvin, Git Mailing List, ftpadmin

On Thu, 10 Apr 2008, Junio C Hamano wrote:

> "H. Peter Anvin" <hpa@zytor.com> writes:
>
>> 1. git protocol via CONNECT http proxy
>>
>>    Connect to http proxy, and use a CONNECT method to establish a link
>>    to the git server, using the normal git protocol.
>>
>>    Minor change to TCP connection setup, but no other changes needed.
>>    No changes on the server side.
>
> Many firewalls will detect that the CONNECT is not going to 443 and block
> you, and even if you run git:// daemon on 443, they will detect that you
> are not talking SSL initial exchange and shut you off.
>
>> 2. git protocol over SSL via CONNECT http proxy
>>
>>    Same as #1, but encapsulate the data stream in an SSL connection.
>>    If the git server is run on port 443, then the fact that the data
>>    on the SSL connection isn't actually HTTP should be invisible to the
>>    proxy, and thus this *should* work anywhere which allows https://
>>    traffic.
>>
>>    Requires the git server to speak SSL.
>
> Yes, perhaps putting it behind an independent ssl relay would give you a
> solution without any code change.

in more paranoid locations they are putting client certs on desktops and 
giving those to the IDS systems so that they can decrypt the SSL traffic, 
so if it doesn't look like HTTP inside the SSL they will block it.

this isn't very common now, but the firewalls that are blocking #1 weren't 
very common a year or so ago either.

>> 3. git protocol encapsulated in HTTP POST transaction
>>
>>    git protocol is already fundamentally a RPC protocol, where the
>>    client sends a query and the server responds.  Furthermore, it
>>    tries to minimize the number of round trips (RPC calls), which is
>>    of course desirable.
>>
>>    Each such RPC transaction could be formulated as an HTTP POST
>>    transaction.
>>
>>    This requires modifications to both the client and the server;
>>    furthermore, the server can no longer rely on the invariant "one TCP
>>    connection == one session"; a proxy might break a single session
>>    into arbitrarily many TCP connections.
>
> It would probably be a one-CS/EE-student-half-a-summer sized project to
> create such a server-side support with a specialized client.

this is probably the best long-term option.

David Lang
