All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Oberndorfer <kumbayo84@arcor.de>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>
Subject: Re: crash on git diff-tree -Ganything <tree> for new files with textconv filter
Date: Sun, 28 Oct 2012 20:56:39 +0100	[thread overview]
Message-ID: <508D8DF7.7040007@arcor.de> (raw)
In-Reply-To: <20121028120104.GE11434@sigill.intra.peff.net>

On 2012-10-28 13:01, Jeff King wrote:
> On Sat, Oct 27, 2012 at 08:37:24PM +0200, Peter Oberndorfer wrote:
>
>> It seems "git diff-tree -Ganything <tree>" crashes[1] with a null
>> pointer dereference
>> when run on a commit that adds a file (pdf) with a textconv filter.
>>
>> It can be reproduced with vanilla git by having a commit on top that
>> adds a file with a textconv filter and executing git diff-tree
>> -Ganything HEAD
>> But running git log -Ganything still works without a crash.
>> This problem seems to exist since the feature was first added in f506b8e8b5.
> Thanks for a thorough bug report. I didn't reproduce the crash, but I
> can see how it happens (it happens with diff-tree because we will reuse
> the working tree file in that instance; it could also happen if you
> turned on textconv caching).
>
>> While testing I also noticed the -S and -G act on the original file
>> instead of the textconv munged data.
>> Is this intentional or caused by accessing the wrong data?
> Both, perhaps. :)
Hi,
thanks for your patch for this!

>
> -G operates on the munged data; you can see it feed the munged data to
> xdiff in diff_grep. But the optimization for handling added and removed
> files accidentally fed the wrong pointer. Fixing that is a no-brainer,
> since the optimization is inconsistent with the regular code path.
>
> -S, however, predates the invention of textconv and has never used it.
> It is a little less clear that textconv is the right thing here, because
> it is not about grepping the diff, but about counting occurrences of the
> string inside the file content. I tend to think that doing so on the
> textconv'd data would be what people generally want, but it is a
> behavior change.
>
>> Wild guess: should we really access p->one->data and not mf1.ptr?
> Precisely. Thanks for your wild guess; it made finding the bug very
> easy. :)
>
>> Is there some more information i should provide?
> The patch below should fix it. I added tests, but please try your
> real-world test case on it to double-check.

I tested your patch, but now it crashes for another reason :-)
i have a file with exactly 12288(0x3000) bytes in the repository.
When the file is loaded, the data is placed luckily so the data end
falls at a page boundary.
Later diff_grep() calls regexec() which calls strlen() on the loaded buffer
and ends up reading beyond the actual data into the next page
which is not allocated and causes a pagefault.
Or it could possibly (randomly) match the regex on data that is not
actually part of a file...
Different memory allocation rules on Windows probably also have some
influence here.

My guess is that diff_filespec->data is not supposed to be zero terminated
and we should not invoke strlen() on a not zero terminated data.
But this should be decided by somebody who knows the rules.

Greetings Peter

  parent reply	other threads:[~2012-10-28 19:57 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-10-27 18:37 crash on git diff-tree -Ganything <tree> for new files with textconv filter Peter Oberndorfer
2012-10-28 12:01 ` Jeff King
2012-10-28 12:45   ` [PATCH 0/2] textconv support for "log -S" Jeff King
2012-10-28 12:46     ` [PATCH 1/2] pickaxe: hoist empty needle check Jeff King
2012-10-28 12:47     ` [PATCH 2/2] pickaxe: use textconv for -S counting Jeff King
2012-11-13 23:13       ` Junio C Hamano
2012-11-15  1:21         ` Jeff King
2012-11-20  0:31           ` Junio C Hamano
2012-11-20  0:48             ` Junio C Hamano
2012-11-21 20:27               ` Jeff King
2012-10-28 19:56   ` Peter Oberndorfer [this message]
2012-10-29  6:05     ` crash on git diff-tree -Ganything <tree> for new files with textconv filter Jeff King
2012-10-29  6:18       ` Jeff King
2012-10-29 20:19       ` Peter Oberndorfer
2012-10-29 22:35         ` Jeff King
2012-10-29 22:47           ` Jeff King
2012-10-30 12:17             ` Jeff King
2012-10-30 12:46               ` Junio C Hamano
2012-10-30 13:12                 ` Jeff King
2012-11-01 19:19               ` Ramsay Jones
2012-11-07 21:10           ` Peter Oberndorfer
2012-11-07 21:13             ` Jeff King
2013-06-03 17:25               ` Peter Oberndorfer
2013-06-03 22:17                 ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=508D8DF7.7040007@arcor.de \
    --to=kumbayo84@arcor.de \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.