git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Martin Ågren" <martin.agren@gmail.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>,
	Andreas Schwab <schwab@suse.de>,
	Lars Schneider <larsxschneider@gmail.com>,
	Git Mailing List <git@vger.kernel.org>,
	Marc Stevens <marc@marc-stevens.nl>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: Unaligned accesses in sha1dc
Date: Fri, 2 Jun 2017 22:11:25 +0200	[thread overview]
Message-ID: <CAN0heSoV3nPO1v=CWze4DfpnmBn7+wVLm3_4f4ouv+PSGMAd+w@mail.gmail.com> (raw)
In-Reply-To: <CACBZZX5re5Ge1OzxYOE42nwBhhusya6=M9An08X-KzaqNH9Cog@mail.gmail.com>

On 2 June 2017 at 21:32, Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> On Fri, Jun 2, 2017 at 11:49 AM, Martin Ågren <martin.agren@gmail.com> wrote:
>> On 2 June 2017 at 10:51, Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>>> On Fri, Jun 2, 2017 at 2:15 AM, Junio C Hamano <gitster@pobox.com> wrote:
>>>> Martin Ågren <martin.agren@gmail.com> writes:
>>>>
>>>>> I looked into this some more. It turns out it is possible to trigger
>>>>> undefined behavior on "next". Here's what I did:
>>>>> ...
>>>>>
>>>>> This "fixes" the problem:
>>>>> ...
>>>>> diff --git a/sha1dc/sha1.c b/sha1dc/sha1.c
>>>>> index 3dff80a..d6f4c44 100644
>>>>> --- a/sha1dc/sha1.c
>>>>> +++ b/sha1dc/sha1.c
>>>>> @@ -66,9 +66,9 @@
>>>>> ...
>>>>> With this diff, various tests which seem relevant for SHA-1 pass,
>>>>> including t0013, and the UBSan-error is gone. The second diff is just
>>>>> a monkey-patch. I have no reason to believe I will be able to come up
>>>>> with a proper and complete patch for sha1dc. And I guess such a thing
>>>>> would not really be Git's patch to carry, either. But at least Git
>>>>> could consider whether to keep relying on undefined behavior or not.
>>>>>
>>>>> There's a fair chance I've mangled the whitespace. I'm using gmail's
>>>>> web interface... Sorry about that.
>>>>
>>>> Thanks.  I see Marc Stevens is CC'ed in the thread, so I'd expect
>>>> that the final "fix" would come from his sha1collisiondetection
>>>> repository via Ævar.
>>>>
>>>> In the meantime, I am wondering if it makes sense to merge the
>>>> earlier update with #ifdef ALLOW_UNALIGNED_ACCESS and #ifdef
>>>> SHA1DC_FORCE_LITTLEENDIAN for the v2.13.x maintenance track, which
>>>> would at least unblock those on platforms v2.13.0 did not work
>>>> correctly at all.
>>>>
>>>> Ævar, thoughts?
>>>
>>> I think we're mixing up several things here, which need to be untangled:
>>>
>>> 1) The sha1dc works just fine on most platforms even with undefined
>>> behavior, as evidenced by 2.13.0 working.
>>
>> Right, with "platform" meaning "combination of hardware-architecture
>> and compiler". Nothing can be said about how the current code behaves
>> on "x86". Such statements can only be made with regard to "x86 and
>> this or that compiler". Even then, short of studying the compiler
>> implementation/documentation in detail, one cannot be certain that
>> seemingly unrelated changes in Git don't make the code do something
>> else entirely.
>
> I think you're veering into a theoretical discussion here that has
> little to no bearing on the practicalities involved here.
>
> Yes if something is undefined behavior in C the compiler &
> architecture is free to do anything they want with it. In practice
> lots of undefined behavior is de-facto standardized across various
> platforms.
>
> As far as I can tell unaligned access is one of those things. I don't
> think there's ever been an x86 chip / compiler that would run this
> code with any semantic differences when it comes to unaligned access,
> and such a chip / compiler is unlikely to ever exist.
>
> I'm not advocating that we rely on undefined behavior willy-nilly,
> just that we should consider the real situation is (i.e. what actual
> architectures / compilers are doing or are likely to do) as opposed to
> the purely theoretical (if you gave a bunch of aliens who'd never
> heard of our technology the ANSI C standard to implement from
> scratch).

Yeah, that's an argument. I just thought I'd provide whatever input I
could, albeit in text form. The only thing that matters in the end is
that you (the Git project) feel that you make the correct decision,
possibly going beyond "theoretical" reasoning into engineering-land.

Take care,
Martin

  reply	other threads:[~2017-06-02 20:11 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-01  8:28 Unaligned accesses in sha1dc Andreas Schwab
     [not found] ` <CDB32E2C-48AF-4636-B921-4C45B614FD35@marc-stevens.nl>
2017-06-01  9:05   ` Andreas Schwab
2017-06-01  9:15 ` Junio C Hamano
2017-06-01  9:15   ` Andreas Schwab
2017-06-01  9:34     ` Junio C Hamano
2017-06-01  9:18 ` brian m. carlson
2017-06-01  9:32   ` Andreas Schwab
2017-06-01  9:21 ` Lars Schneider
2017-06-01  9:53   ` Junio C Hamano
2017-06-01 10:08     ` Andreas Schwab
2017-06-01 10:26       ` Martin Ågren
2017-06-01 10:33         ` Ævar Arnfjörð Bjarmason
2017-06-01 11:53           ` Martin Ågren
2017-06-01 15:57             ` Martin Ågren
2017-06-02  0:15               ` Junio C Hamano
2017-06-02  8:51                 ` Ævar Arnfjörð Bjarmason
2017-06-02  9:49                   ` Martin Ågren
2017-06-02 19:32                     ` Ævar Arnfjörð Bjarmason
2017-06-02 20:11                       ` Martin Ågren [this message]
2017-06-02 20:14                         ` Ævar Arnfjörð Bjarmason
2017-06-02 20:25                           ` demerphq
2017-06-02 20:17                       ` demerphq
2017-06-02 20:38                         ` Ævar Arnfjörð Bjarmason
2017-06-02 21:53                         ` Linus Torvalds
2017-06-03  0:13                           ` Junio C Hamano
2017-06-02 14:46                   ` Liam R. Howlett
2017-06-02 16:53                     ` Ævar Arnfjörð Bjarmason
2017-06-03  0:15                       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAN0heSoV3nPO1v=CWze4DfpnmBn7+wVLm3_4f4ouv+PSGMAd+w@mail.gmail.com' \
    --to=martin.agren@gmail.com \
    --cc=Liam.Howlett@oracle.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=larsxschneider@gmail.com \
    --cc=marc@marc-stevens.nl \
    --cc=schwab@suse.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).