From: bugzilla-daemon@freedesktop.org
To: dri-devel@lists.freedesktop.org
Subject: [Bug 109007] radeonsi cache format changed, causes mesa crash on startup
Date: Tue, 11 Dec 2018 07:48:30 +0000

https://bugs.freedesktop.org/show_bug.cgi?id=109007

Bug ID: 109007
Summary: radeonsi cache format changed, causes mesa crash on startup
Product: Mesa
Version: git
Hardware: Other
OS: All
Status: NEW
Severity: normal
Priority: medium
Component: Drivers/Gallium/radeonsi
Assignee: dri-devel@lists.freedesktop.org
Reporter: dan@reactivated.net
QA Contact: dri-devel@lists.freedesktop.org

Having upgraded from Mesa-17.3 to Mesa-18.1 in Endless OS, many users on
AMD-based platforms are now reporting that the system fails to boot into the
UI. I've reproduced this and can confirm that Xorg is crashing very early on.

Thread 4 "si_shader:0" received signal SIGSEGV, Segmentation fault.
__memcpy_ssse3_back () at ../sysdeps/x86_64/multiarch/memcpy-ssse3-back.S:1533
Backtrace:
#0  __memcpy_ssse3_back ()
    at ../sysdeps/x86_64/multiarch/memcpy-ssse3-back.S:1533
#1  0x00007fffeeba2038 in memcpy (__len=3221880836, __src=0x7fffe4000e70,
    __dest=<optimized out>) at /usr/include/x86_64-linux-gnu/bits/string3.h:53
#2  read_data (size=3221880836, data=<optimized out>, ptr=0x7fffe4000e70)
    at ../../../../../src/gallium/drivers/radeonsi/si_state_shaders.c:95
#3  read_chunk (ptr=0x7fffe4000e70, ptr@entry=0x7fffe4000e6c,
    data=data@entry=0x7fffe4000998, size=size@entry=0x7fffe4000980)
    at ../../../../../src/gallium/drivers/radeonsi/si_state_shaders.c:121
#4  0x00007fffeeba21b3 in si_load_shader_binary (
    shader=shader@entry=0x7fffe40008c0, binary=binary@entry=0x7fffe4000e00)
    at ../../../../../src/gallium/drivers/radeonsi/si_state_shaders.c:187
#5  0x00007fffeeba4810 in si_shader_cache_load_shader (shader=0x7fffe40008c0,
    ir_binary=0x7fffe4000a50, sscreen=0x555555a393a0)
    at ../../../../../src/gallium/drivers/radeonsi/si_state_shaders.c:275
#6  si_init_shader_selector_async (job=job@entry=0x555555b8dfa0,
    thread_index=thread_index@entry=0)
    at ../../../../../src/gallium/drivers/radeonsi/si_state_shaders.c:1875
#7  0x00007fffee747a55 in util_queue_thread_func (
    input=input@entry=0x555555a39fb0) at ../../../src/util/u_queue.c:271
#8  0x00007fffee7476c7 in impl_thrd_routine (p=<optimized out>)
    at ../../../include/c11/threads_posix.h:87
#9  0x00007ffff574d494 in start_thread (arg=0x7fffebe06700)

The problem here is that the on-disk radeonsi cache format changed without
consideration for this in the code. The affected codepath is
si_load_shader_binary() which does:

        uint32_t size = *ptr++;
        uint32_t crc32 = *ptr++;
        [...]
        ptr = read_data(ptr, &shader->config, sizeof(shader->config));
        ptr = read_data(ptr, &shader->info, sizeof(shader->info));
        ptr = read_chunk(ptr, (void**)&shader->binary.code,
                         &shader->binary.code_size);

So, the blob format is: 4 bytes size, 4 bytes CRC, shader config, shader info,
code.

In Mesa-17.3 the si_shader_config was 48 bytes in size, but in Mesa-18.1 and
current master, si_shader_config is 52 bytes in size, because the
max_simd_wave field was added.

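After upgrading Mesa to 18.1, with shaders compiled and cached by Mesa-17.3
still on disk, the above code no longer behaves as intended: it consumes 52
bytes for a config that was written as 48 bytes, so the reads that follow
start at the wrong offset. We enter read_chunk() with the offsets slightly
wrong: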

        *size = *ptr++;
        assert(*data == NULL);
        if (!*size)
                return ptr;
        *data = malloc(*size);
        return read_data(ptr, *data, *size);

and when this code executes, *size has value 3221880836, for a shader that was
only 884 bytes uncompressed. read_data() then tries to memcpy this much data,
and that causes the crash.

In addition to the lack of invalidation of existing disk caches after the
on-disk format was changed, this code also seems rather suspect in that it
does not verify that it is not reading beyond the end of the shader. As an
attacker I could maliciously rewrite the size field read by the read_chunk()
code above to be very large, fix up the CRC and recompress, and then I could
cause other apps to crash in this way.

