git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Reece Dunn" <msclrhd@googlemail.com>
To: "Johan Herland" <johan@herland.net>,
	git@vger.kernel.org, "Junio C Hamano" <gitster@pobox.com>,
	"Timo Sirainen" <tss@iki.fi>,
	"Linus Torvalds" <torvalds@linux-foundation.org>
Subject: Re: Buffer overflows
Date: Sun, 2 Sep 2007 16:11:32 +0100	[thread overview]
Message-ID: <3f4fd2640709020811r4ea8f01fw775257859e26af29@mail.gmail.com> (raw)
In-Reply-To: <200709021542.31100.johan@herland.net>

On 02/09/07, Johan Herland <johan@herland.net> wrote:
> On Friday 31 August 2007, Junio C Hamano wrote:
> > The API needs to justify itself to convince the people who needs
> > to learn and adjust to that the benefit far outweighes deviation
> > from better known patterns, and I do not see that happening in
> > Timo's patch.
>
> So in general, git people seem to be saying that:
>
> 1. Yes, we agree that the C string library suX0rs badly.
>
> 2. There are more than 0 string manipulation bugs (e.g. buffer overflows) in
> git. The number may be small or large, but I have yet to see anyone claim
> it's _zero_.
>
> 3. Timo's patches (in their current form) are not the way to go, because of
> non-standard API, implementation problems, whatever...
>
> So why does the discussion end there? Lukas proposed an interesting
> alternative in "The Better String Library" (
> http://bstring.sourceforge.net/ ). Why has there been lots of bashing on
> Timo's efforts, but no critique of bstring? I'd be very keen to know what
> the git developers think of it. AFAICS, it seems to fulfill at least _some_
> of the problems people find in Timo's patches. Specifically, it claims:
>
> - High performance (better than the C string library)
> - Simple usage

Performing a brief look at the documentation, the bstring library
looks promising.

It looks like it has an allocate and grow internal buffers on demand
policy. This is similar to what the C++ std::basic_string does, as
well as the string helpers in the Boost version of Jam (written in C).
This hides the buffer management from the user of the library, rather
than obfuscating it like in Timo's patch.

The API defined in the documentation is well thought out and
extensive, moreso than in the efforts by Timo and others. It has the
traditional C API, along with other API found in other string
libraries (such as split and join). I am not sure how much of git
could make use of these, but they have the pontential to simplify some
areas of the codebase.

Looking at the documentation, it is clear that this is a well thought
out library, both from the problems/security issues of the C library
and to how it compares with other string libraries. As well as
covering buffer overflow, it also deals with things like integer
overflow.

They have also done performance tests comparing the bstring library to
the C API and C++ std::string. With the C API comparison, the library
performs about 10% slower for string assignment, but other areas don't
have a slowdown. In fact, string concatenation is _considerably_
improved, something that will help git performance. I suspect (but
have not verified) that the slowdown on assignment is due to buffer
allocation.

> I'd also say it's probably more widely used than Timo's patches.

Which is good, as this means that along with the tests in the library,
it will be more stable and less likely to be buggy than something that
is written from scratch.

> If the only response to Timo's highlighting of string manipulation problems
> in git, is for us to flame his patches and leave it at that, then I have no
> choice but to agree with him in that security does not seem to matter to
> us.

I would not like to see that happen. It seems that the bstring library
will help git in more ways than security, by improving string
concatenation performance and giving a richer string API without
sacrificing performance (except where noted) and code clarity.

It would be interesting to see how the 10% performance drop on string
assignment impacts git performance, when balanced with the drastic
(92x in the performance table) increase on string concatenation.

The only major issue that I can see with bstring is that it does not
have a wchar_t version, but git is using chars internally, so this is
not a problem for git.

- Reece

  reply	other threads:[~2007-09-02 15:11 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-08-30 19:26 Buffer overflows Timo Sirainen
2007-08-30 20:26 ` Lukas Sandström
2007-08-30 20:46 ` Linus Torvalds
2007-08-30 21:08   ` Timo Sirainen
2007-08-30 21:35     ` Reece Dunn
2007-08-30 21:51       ` Timo Sirainen
2007-08-30 22:34         ` Reece Dunn
2007-08-31 10:52           ` Wincent Colaiuta
2007-08-31 12:48             ` Simon 'corecode' Schubert
2007-08-30 22:14       ` Junio C Hamano
2007-08-30 22:36         ` Pierre Habouzit
2007-08-30 22:41         ` Timo Sirainen
2007-09-02 13:42         ` Johan Herland
2007-09-02 15:11           ` Reece Dunn [this message]
2007-09-02 15:19             ` David Kastrup
2007-09-02 15:35               ` Reece Dunn
2007-09-03  0:19               ` Jakub Narebski
2007-09-03  0:31                 ` Junio C Hamano
2007-09-02 17:17           ` René Scharfe
2007-09-02 17:39             ` Lukas Sandström
2007-08-31  4:09     ` Linus Torvalds
2007-08-31  5:00       ` Timo Sirainen
2007-08-31  9:53         ` Andreas Ericsson
2007-08-31 10:06         ` Johannes Schindelin
2007-08-30 21:48 ` [PATCH] Temporary fix for stack smashing in mailinfo Alex Riesen
2007-08-30 22:53   ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3f4fd2640709020811r4ea8f01fw775257859e26af29@mail.gmail.com \
    --to=msclrhd@googlemail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=johan@herland.net \
    --cc=torvalds@linux-foundation.org \
    --cc=tss@iki.fi \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).