* memcpy is leaking secret data through ZMM vector registers
From: Mikulas Patocka @ 2024-04-19 14:07 UTC
To: libc-alpha; +Cc: Zdenek Kabelac, Ondrej Kozina, Milan Broz, dm-devel
Hi
As a part of LVM2, we are developing the libdevmapper library. The library
may be used to load cryptographic keys to the kernel, so it avoids leaking
the data to kernel memory and to the swap partition.
After the use of cryptographic data, the libdevmapper library clears them
with memset and frees them afterwards. It executes __asm__ volatile("" :::
"memory") to thwart some compiler optimization regarding writing to
to-be-freed memory.
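
A minimal sketch of the clearing idiom described above, assuming GCC or Clang inline asm. The helper name `wipe_secret` is hypothetical and this is not libdevmapper's actual code. It also passes the pointer to the empty asm statement, a small variation on the plain "memory" clobber, so the compiler has to treat the buffer as observed:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from libdevmapper. */
static void wipe_secret(void *buf, size_t len)
{
    memset(buf, 0, len);
    /* Empty asm that takes buf as an input and clobbers "memory": the
       compiler must assume the zeroed bytes are read here, so it cannot
       discard the memset as a dead store before a following free(). */
    __asm__ volatile("" : : "r"(buf) : "memory");
}
```

This only scrubs the buffer in memory. It does nothing about copies of the data left in registers, which is the subject of this report.
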
We have a test "dmsecuretest.sh" that loads cryptographic keys into the
kernel, dumps a core, the core file is analyzed and if it contains the
key, the test fails.
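
The check described above can be illustrated with a short sketch. The real test is a shell script whose logic is not shown in this thread, so the function below (`dump_contains_key`, a hypothetical name) only shows the idea of scanning a memory image for the key bytes:

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the key_len bytes of key occur anywhere in the dump image,
   0 otherwise. Illustrative only; not the logic of dmsecuretest.sh. */
static int dump_contains_key(const unsigned char *dump, size_t dump_len,
                             const unsigned char *key, size_t key_len)
{
    if (key_len == 0 || key_len > dump_len)
        return 0;
    /* Naive scan: compare the key against every possible offset. */
    for (size_t i = 0; i + key_len <= dump_len; i++)
        if (memcmp(dump + i, key, key_len) == 0)
            return 1;
    return 0;
}
```
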
This test fails on AMD Zen 4 - the reason for the failure is that the
"memcpy" function uses ZMM registers for data copying. When memcpy exits,
the encryption key is present in the ZMM registers and the key remains
there even after both source and destination buffers of memcpy were
cleared.
When we perform dynamic symbol lookup, the ZMM registers are spilled on
the stack and they remain there forever - this is the reason why the core
file contains the encryption key and the test fails.
I'd like to ask what to do with it? We could use LD_BIND_NOW=1 (or
-Wl,-z,now) - it mostly works, but not entirely - the key may still be
present on the stack even if we use LD_BIND_NOW=1.
When I hack the file glibc/sysdeps/x86_64/multiarch/ifunc-memmove.h so
that it always selects the ERMS variant of memcpy, the problem goes away.
Could it be possible to add some switch to glibc, that could be turned on
by security-sensitive programs and that would prevent glibc from using the
vector registers? Or, do you suggest another solution?
Mikulas

* Re: memcpy is leaking secret data through ZMM vector registers
From: H.J. Lu @ 2024-04-19 14:19 UTC
To: Mikulas Patocka; +Cc: libc-alpha, Zdenek Kabelac, Ondrej Kozina, Milan Broz, dm-devel

On Fri, Apr 19, 2024 at 7:08 AM Mikulas Patocka <mpatocka@redhat.com> wrote:
>
> Hi
>
> As a part of LVM2, we are developing the libdevmapper library. The library
> may be used to load cryptographic keys to the kernel, so it avoids leaking
> the data to kernel memory and to the swap partition.
>
> After the use of cryptographic data, the libdevmapper library clears them
> with memset and frees them afterwards. It executes __asm__ volatile("" :::
> "memory") to thwart some compiler optimization regarding writing to
> to-be-freed memory.
>
> We have a test "dmsecuretest.sh" that loads cryptographic keys into the
> kernel, dumps a core, the core file is analyzed and if it contains the
> key, the test fails.
>
> This test fails on AMD Zen 4 - the reason for the failure is that the
> "memcpy" function uses ZMM registers for data copying. When memcpy exits,
> the encryption key is present in the ZMM registers and the key remains
> there even after both source and destination buffers of memcpy were
> cleared.
>
> When we perform dynamic symbol lookup, the ZMM registers are spilled on
> the stack and they remain there forever - this is the reason why the core
> file contains the encryption key and the test fails.
>
> I'd like to ask what to do with it? We could use LD_BIND_NOW=1 (or
> -Wl,-z,now) - it mostly works, but not entirely - the key may still be
> present on the stack even if we use LD_BIND_NOW=1.

Since vector registers are saved on stack only during symbol lookup,
shouldn't disabling lazy binding solve this issue?

> When I hack the file glibc/sysdeps/x86_64/multiarch/ifunc-memmove.h so
> that it always selects the ERMS variant of memcpy, the problem goes away.
>
> Could it be possible to add some switch to glibc, that could be turned on
> by security-sensitive programs and that would prevent glibc from using the
> vector registers? Or, do you suggest another solution?
>
> Mikulas
>

--
H.J.

* Re: memcpy is leaking secret data through ZMM vector registers
From: Mikulas Patocka @ 2024-04-19 14:24 UTC
To: H.J. Lu; +Cc: libc-alpha, Zdenek Kabelac, Ondrej Kozina, Milan Broz, dm-devel

On Fri, 19 Apr 2024, H.J. Lu wrote:

> On Fri, Apr 19, 2024 at 7:08 AM Mikulas Patocka <mpatocka@redhat.com> wrote:
> >
> > I'd like to ask what to do with it? We could use LD_BIND_NOW=1 (or
> > -Wl,-z,now) - it mostly works, but not entirely - the key may still be
> > present on the stack even if we use LD_BIND_NOW=1.
>
> Since vector registers are saved on stack only during symbol lookup,
> shouldn't disabling lazy binding solve this issue?

It should, but it doesn't fix this problem entirely.

If I set "GLIBC_TUNABLES=glibc.cpu.hwcaps=-AVX512F,-AVX2" "LD_BIND_NOW=1",
I still get a failure (I don't get the failure if I don't set
GLIBC_TUNABLES and set only LD_BIND_NOW).

So, even if we use plain SSE, the data somehow end up on the stack.

Mikulas

* Re: memcpy is leaking secret data through ZMM vector registers
From: H.J. Lu @ 2024-04-19 14:37 UTC
To: Mikulas Patocka; +Cc: libc-alpha, Zdenek Kabelac, Ondrej Kozina, Milan Broz, dm-devel

On Fri, Apr 19, 2024 at 7:24 AM Mikulas Patocka <mpatocka@redhat.com> wrote:
>
> On Fri, 19 Apr 2024, H.J. Lu wrote:
>
> > On Fri, Apr 19, 2024 at 7:08 AM Mikulas Patocka <mpatocka@redhat.com> wrote:
> > >
> > > I'd like to ask what to do with it? We could use LD_BIND_NOW=1 (or
> > > -Wl,-z,now) - it mostly works, but not entirely - the key may still be
> > > present on the stack even if we use LD_BIND_NOW=1.
> >
> > Since vector registers are saved on stack only during symbol lookup,
> > shouldn't disabling lazy binding solve this issue?
>
> It should, but it doesn't fix this problem entirely.
>
> If I set "GLIBC_TUNABLES=glibc.cpu.hwcaps=-AVX512F,-AVX2" "LD_BIND_NOW=1",
> I still get a failure (I don't get the failure if I don't set
> GLIBC_TUNABLES and set only LD_BIND_NOW).
>
> So, even if we use plain SSE, the data somehow end up on the stack.
>

You should write your own memory copy function and compile it with
-fzero-call-used-regs if possible.

--
H.J.
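
A rough sketch of what such a private copy routine might look like. The name `secret_copy` and the `ZERO_REGS` macro are hypothetical. The sketch uses the per-function attribute form, `zero_call_used_regs`, instead of the command-line flag, and falls back to no zeroing when the compiler does not advertise the attribute. The byte-wise volatile loop is an assumption on my side, chosen so the compiler is unlikely to turn the loop back into a memcpy call or vectorize it:

```c
#include <stddef.h>

/* Use the attribute only when the compiler advertises it
   (GCC 11 and later; newer Clang releases). */
#if defined(__has_attribute)
# if __has_attribute(zero_call_used_regs)
#  define ZERO_REGS __attribute__((zero_call_used_regs("used")))
# endif
#endif
#ifndef ZERO_REGS
# define ZERO_REGS /* attribute unavailable: registers are not zeroed */
#endif

/* Hypothetical replacement for memcpy on secret buffers. */
ZERO_REGS
void secret_copy(void *dst, const void *src, size_t len)
{
    /* volatile accesses force one byte load and one byte store per
       iteration, so the data passes through a general-purpose register
       that the attribute zeroes on return. */
    volatile unsigned char *d = dst;
    const volatile unsigned char *s = src;
    while (len--)
        *d++ = *s++;
}
```

The "used" choice only clears registers the function itself touched; "all" is the broader setting. Which registers are actually cleared depends on the compiler version and target options, so the generated code should be inspected before relying on it.
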

* Re: memcpy is leaking secret data through ZMM vector registers
From: Mikulas Patocka @ 2024-04-19 18:04 UTC
To: H.J. Lu; +Cc: libc-alpha, Zdenek Kabelac, Ondrej Kozina, Milan Broz, dm-devel

On Fri, 19 Apr 2024, H.J. Lu wrote:

> You should write your own memory copy function and compile it with
> -fzero-call-used-regs if possible.
>
> --
> H.J.

This would work - but I looked at OpenSSL and it seems to suffer from the
same problem as libdevmapper. OpenSSL uses plain memcpy, it overwrites
memory before freeing it, but it doesn't overwrite the YMM and ZMM
registers.

So, it seems like overkill to add a special memcpy implementation to every
library that manipulates sensitive data. It may be better to have some
general solution. There's already "explicit_bzero", so maybe we could add
"explicit_memcpy" or "secure_memcpy"?

Mikulas

* Re: memcpy is leaking secret data through ZMM vector registers
From: Paul Eggert @ 2024-04-19 18:45 UTC
To: Mikulas Patocka, H.J. Lu; +Cc: libc-alpha, Zdenek Kabelac, Ondrej Kozina, Milan Broz, dm-devel

On 4/19/24 11:04, Mikulas Patocka wrote:
> There's already "explicit_bzero", so maybe we could add
> "explicit_memcpy"

Where would this stop? Wouldn't we also need explicit_memcmp,
explicit_memmove, explicit_mempcpy, etc.? Pretty much any function that
looks at memory could have the problem. Even C source code that doesn't
invoke any C library function could have the problem.

On the library side, shouldn't this sort of thing be handled by
_FORTIFY_SOURCE or something similar? And don't we need a compiler option
saying "don't cache anything in registers"?

* Re: memcpy is leaking secret data through ZMM vector registers
From: Zack Weinberg @ 2024-04-19 18:47 UTC
To: Paul Eggert, Mikulas Patocka, H . J . Lu; +Cc: GNU libc development, Zdenek Kabelac, Ondrej Kozina, Milan Broz, dm-devel

On Fri, Apr 19, 2024, at 2:45 PM, Paul Eggert wrote:
> On 4/19/24 11:04, Mikulas Patocka wrote:
>> There's already "explicit_bzero", so maybe we could add
>> "explicit_memcpy"
>
> Where would this stop? Wouldn't we also need explicit_memcmp,
> explicit_memmove, explicit_mempcpy, etc.? Pretty much any function that
> looks at memory could have the problem. Even C source code that doesn't
> invoke any C library function could have the problem.

As I recall, one of the arguments for _not_ adding explicit_bzero to glibc
was that we couldn't guarantee copies of the secret data wouldn't hang
around in registers.

Is a hypothetical function __attribute__((clear_call_clobbered_regs_on_exit))
what we need here instead, maybe?

zw

* Re: memcpy is leaking secret data through ZMM vector registers
From: Alexander Monakov @ 2024-04-19 18:53 UTC
To: Zack Weinberg; +Cc: Paul Eggert, Mikulas Patocka, H . J . Lu, GNU libc development, Zdenek Kabelac, Ondrej Kozina, Milan Broz, dm-devel

On Fri, 19 Apr 2024, Zack Weinberg wrote:

> On Fri, Apr 19, 2024, at 2:45 PM, Paul Eggert wrote:
> > On 4/19/24 11:04, Mikulas Patocka wrote:
> >> There's already "explicit_bzero", so maybe we could add
> >> "explicit_memcpy"
> >
> > Where would this stop? Wouldn't we also need explicit_memcmp,
> > explicit_memmove, explicit_mempcpy, etc.? Pretty much any function that
> > looks at memory could have the problem. Even C source code that doesn't
> > invoke any C library function could have the problem.
>
> As I recall, one of the arguments for _not_ adding explicit_bzero to glibc
> was that we couldn't guarantee copies of the secret data wouldn't hang
> around in registers.

bzero and memset have no reason to read data from memory, they only need
to overwrite that memory. This makes them different from memcpy.

In the caller of memset/memcpy, sure, copies of that data may be present
on registers.

> Is a hypothetical function __attribute__((clear_call_clobbered_regs_on_exit))
> what we need here instead, maybe?

As indicated upthread, there's a non-hypothetical
__attribute__((zero_call_used_regs)), unless you mean something else?

Alexander

* Re: memcpy is leaking secret data through ZMM vector registers
From: Zack Weinberg @ 2024-04-19 19:11 UTC
To: Alexander Monakov; +Cc: Paul Eggert, Mikulas Patocka, H . J . Lu, GNU libc development, Zdenek Kabelac, Ondrej Kozina, Milan Broz, dm-devel

On Fri, Apr 19, 2024, at 2:53 PM, Alexander Monakov wrote:
> On Fri, 19 Apr 2024, Zack Weinberg wrote:
>
>> On Fri, Apr 19, 2024, at 2:45 PM, Paul Eggert wrote:
>> > On 4/19/24 11:04, Mikulas Patocka wrote:
>> >> There's already "explicit_bzero", so maybe we could add
>> >> "explicit_memcpy"
>> >
>> > Where would this stop? Wouldn't we also need explicit_memcmp,
>> > explicit_memmove, explicit_mempcpy, etc.? Pretty much any function that
>> > looks at memory could have the problem. Even C source code that doesn't
>> > invoke any C library function could have the problem.
>>
>> As I recall, one of the arguments for _not_ adding explicit_bzero to glibc
>> was that we couldn't guarantee copies of the secret data wouldn't hang
>> around in registers.
>
> bzero and memset have no reason to read data from memory, they only need
> to overwrite that memory. This makes them different from memcpy.

Yes, but the compiler does not know that bzero/explicit_bzero/memset only write
and do not read, which means if you have something like

    void aes256_encrypt_in_place(const uint8_t *key, const uint8_t *iv,
                                 uint8_t *data, size_t len)
    {
        __m128 round_keys[AES256_N_ROUND_KEYS];
        aes256_expand_key(key, round_keys);
        aes256_do_cbc(round_keys, iv, data, len);
        explicit_bzero(round_keys, sizeof round_keys);
    }

and aes256_expand_key and aes256_do_cbc get inlined, the compiler might
be able to keep the entire key schedule in the vector registers *until*
the call to explicit_bzero. But right before calling explicit_bzero,
it will have to copy the round_keys array onto the stack! And the copy
of round_keys in the vector registers *won't* get erased -- the exact
problem being discussed in this thread.

>> Is a hypothetical function __attribute__((clear_call_clobbered_regs_on_exit))
>> what we need here instead, maybe?
>
> As indicated upthread, there's a non-hypothetical
> __attribute__((zero_call_used_regs)), unless you mean something else?

I didn't know whether "call used" meant what I mean by "call clobbered".
Also, it's not clear to me whether this is bulletproof (under whatever name).

zw

* Re: memcpy is leaking secret data through ZMM vector registers
From: Mikulas Patocka @ 2024-04-19 20:15 UTC
To: Zack Weinberg; +Cc: Alexander Monakov, Paul Eggert, H . J . Lu, GNU libc development, Zdenek Kabelac, Ondrej Kozina, Milan Broz, dm-devel

On Fri, 19 Apr 2024, Zack Weinberg wrote:

> Yes, but the compiler does not know that bzero/explicit_bzero/memset only write
> and do not read, which means if you have something like
>
>     void aes256_encrypt_in_place(const uint8_t *key, const uint8_t *iv,
>                                  uint8_t *data, size_t len)
>     {
>         __m128 round_keys[AES256_N_ROUND_KEYS];
>         aes256_expand_key(key, round_keys);
>         aes256_do_cbc(round_keys, iv, data, len);
>         explicit_bzero(round_keys, sizeof round_keys);
>     }
>
> and aes256_expand_key and aes256_do_cbc get inlined, the compiler might
> be able to keep the entire key schedule in the vector registers *until*
> the call to explicit_bzero. But right before calling explicit_bzero,
> it will have to copy the round_keys array onto the stack! And the copy
> of round_keys in the vector registers *won't* get erased -- the exact
> problem being discussed in this thread.

On the SYSV ABI, all the vector registers are volatile, so you can erase
them in explicit_bzero.

On Windows 64-bit ABI, it is more problematic, because some of the vector
registers must be preserved.

Mikulas

* Re: memcpy is leaking secret data through ZMM vector registers
From: Zack Weinberg @ 2024-04-19 20:31 UTC
To: Mikulas Patocka; +Cc: Alexander Monakov, Paul Eggert, H . J . Lu, GNU libc development, Zdenek Kabelac, Ondrej Kozina, Milan Broz, dm-devel

On Fri, Apr 19, 2024, at 4:15 PM, Mikulas Patocka wrote:
> On Fri, 19 Apr 2024, Zack Weinberg wrote:
>> ... the copy
>> of round_keys in the vector registers *won't* get erased -- the exact
>> problem being discussed in this thread.
>
> On the SYSV ABI, all the vector registers are volatile, so you can erase
> them in explicit_bzero.
>
> On Windows 64-bit ABI, it is more problematic, because some of the vector
> registers must be preserved.

Oh, huh. Yes, that would work. Call-preserved registers are not a
problem, because any function that puts secret data in a call-preserved
register in the first place, must erase it again (by restoring the old
value) before returning. Therefore, if we made explicit_bzero wipe *all*
the call-clobbered registers before returning, my example function would
be safe.

There's still a place secrets could leak to and not get erased, though:
register spill slots on the stack. Only the compiler could plug this
leak. Long term, I think what we want is something like
__attribute__((sensitive)), which can only be applied to variables with
automatic storage duration, and which means "erase all copies of this
variable's value, wherever they wound up, at the end of its lifetime."
Note that such variables must not be put in call-preserved registers in
non-leaf functions, because then they might get spilled to the stack by
a callee, which has no way of knowing that it's just leaked a secret.
And I suppose we might also want to worry about signal frames. Nobody
said this was gonna be easy ;-)

zw

* Re: memcpy is leaking secret data through ZMM vector registers
From: Mikulas Patocka @ 2024-04-19 21:11 UTC
To: Zack Weinberg; +Cc: Alexander Monakov, Paul Eggert, H . J . Lu, GNU libc development, Zdenek Kabelac, Ondrej Kozina, Milan Broz, dm-devel

On Fri, 19 Apr 2024, Zack Weinberg wrote:

> On Fri, Apr 19, 2024, at 4:15 PM, Mikulas Patocka wrote:
> > On Fri, 19 Apr 2024, Zack Weinberg wrote:
> >> ... the copy
> >> of round_keys in the vector registers *won't* get erased -- the exact
> >> problem being discussed in this thread.
> >
> > On the SYSV ABI, all the vector registers are volatile, so you can erase
> > them in explicit_bzero.
> >
> > On Windows 64-bit ABI, it is more problematic, because some of the vector
> > registers must be preserved.
>
> Oh, huh. Yes, that would work.

I've just realized that this wouldn't work - if the function
explicit_bzero is lazily resolved, the dynamic linker would spill the
vector registers to the stack prior to calling explicit_bzero.

> Call-preserved registers are not a
> problem, because any function that puts secret data in a call-preserved
> register in the first place, must erase it again (by restoring the old
> value) before returning. Therefore, if we made explicit_bzero wipe *all*
> the call-clobbered registers before returning, my example function would
> be safe.
>
> There's still a place secrets could leak to and not get erased, though:
> register spill slots on the stack. Only the compiler could plug this
> leak. Long term, I think what we want is something like
> __attribute__((sensitive)), which can only be applied to variables with
> automatic storage duration, and which means "erase all copies of this
> variable's value, wherever they wound up, at the end of its lifetime."
> Note that such variables must not be put in call-preserved registers in
> non-leaf functions, because then they might get spilled to the stack by
> a callee, which has no way of knowing that it's just leaked a secret.
> And I suppose we might also want to worry about signal frames. Nobody
> said this was gonna be easy ;-)
>
> zw

Yes. Another problem is varargs - if there is at least one floating point
argument, the compiler will store 8 XMM registers on the stack regardless
of whether they are used or not.

In the past it didn't do it (it made indirect jump based on the value in
the %AL register to save only the used registers), but someone probably
found out that indirect jumps are expensive and that storing all 8
registers is faster.

Mikulas

* Re: memcpy is leaking secret data through ZMM vector registers
From: Florian Weimer @ 2024-04-19 23:27 UTC
To: Mikulas Patocka; +Cc: Zack Weinberg, Alexander Monakov, Paul Eggert, H . J . Lu, GNU libc development, Zdenek Kabelac, Ondrej Kozina, Milan Broz, dm-devel

* Mikulas Patocka:

> On Fri, 19 Apr 2024, Zack Weinberg wrote:
>
>> On Fri, Apr 19, 2024, at 4:15 PM, Mikulas Patocka wrote:
>> > On Fri, 19 Apr 2024, Zack Weinberg wrote:
>> >> ... the copy
>> >> of round_keys in the vector registers *won't* get erased -- the exact
>> >> problem being discussed in this thread.
>> >
>> > On the SYSV ABI, all the vector registers are volatile, so you can erase
>> > them in explicit_bzero.
>> >
>> > On Windows 64-bit ABI, it is more problematic, because some of the vector
>> > registers must be preserved.
>>
>> Oh, huh. Yes, that would work.
>
> I've just realized that this wouldn't work - if the function
> explicit_bzero is lazily resolved, the dynamic linker would spill the
> vector registers to the stack prior to calling explicit_bzero.

No, the dynamic linker makes a tail call to explicit_bzero. There's no
register restore on the return path, all that happens before the tail
call.

Thanks,
Florian

* Re: memcpy is leaking secret data through ZMM vector registers
From: Zack Weinberg @ 2024-04-20 3:29 UTC
To: Florian Weimer, Mikulas Patocka; +Cc: Alexander Monakov, Paul Eggert, H . J . Lu, GNU libc development, Zdenek Kabelac, Ondrej Kozina, Milan Broz, dm-devel

On Fri, Apr 19, 2024, at 7:27 PM, Florian Weimer wrote:
> * Mikulas Patocka:
>
>> On Fri, 19 Apr 2024, Zack Weinberg wrote:
>>
>>> On Fri, Apr 19, 2024, at 4:15 PM, Mikulas Patocka wrote:
>>> > On Fri, 19 Apr 2024, Zack Weinberg wrote:
>>> >> ... the copy
>>> >> of round_keys in the vector registers *won't* get erased -- the exact
>>> >> problem being discussed in this thread.
>>> >
>>> > On the SYSV ABI, all the vector registers are volatile, so you can erase
>>> > them in explicit_bzero.
>>> >
>>> > On Windows 64-bit ABI, it is more problematic, because some of the vector
>>> > registers must be preserved.
>>>
>>> Oh, huh. Yes, that would work.
>>
>> I've just realized that this wouldn't work - if the function
>> explicit_bzero is lazily resolved, the dynamic linker would spill the
>> vector registers to the stack prior to calling explicit_bzero.
>
> No, the dynamic linker makes a tail call to explicit_bzero. There's no
> register restore on the return path, all that happens before the tail
> call.

Doesn't help - if the vector registers get spilled at all, we lose.

zw

* Re: memcpy is leaking secret data through ZMM vector registers
From: Andreas K. Huettel @ 2024-04-21 1:20 UTC
To: libc-alpha; +Cc: Zdenek Kabelac, Ondrej Kozina, Milan Broz, dm-devel, Mikulas Patocka

> We have a test "dmsecuretest.sh" that loads cryptographic keys into the
> kernel, dumps a core, the core file is analyzed and if it contains the
> key, the test fails.
>
> This test fails on AMD Zen 4 - the reason for the failure is that the
> "memcpy" function uses ZMM registers for data copying. When memcpy exits,
> the encryption key is present in the ZMM registers and the key remains
> there even after both source and destination buffers of memcpy were
> cleared.
>
> When we perform dynamic symbol lookup, the ZMM registers are spilled on
> the stack and they remain there forever - this is the reason why the core
> file contains the encryption key and the test fails.

So let me ask a few obvious questions, as someone with not (yet) deep
insights into the problem.

* Shouldn't this be treated as a security issue?
* Are the expectations on where the (key) data may end up defined somewhere?
* If yes, which component behaves faulty?
* If no, who needs to be involved in making the specs?

--
Andreas K. Hüttel
dilfridge@gentoo.org
Gentoo Linux developer
(council, comrel, toolchain, base-system, perl, libreoffice)
https://wiki.gentoo.org/wiki/User:Dilfridge

* Re: memcpy is leaking secret data through ZMM vector registers
From: Szabolcs Nagy @ 2024-04-22 9:33 UTC
To: Mikulas Patocka, libc-alpha; +Cc: Zdenek Kabelac, Ondrej Kozina, Milan Broz, dm-devel

The 04/19/2024 16:07, Mikulas Patocka wrote:
> Hi
>
> As a part of LVM2, we are developing the libdevmapper library. The library
> may be used to load cryptographic keys to the kernel, so it avoids leaking
> the data to kernel memory and to the swap partition.
>
> After the use of cryptographic data, the libdevmapper library clears them
> with memset and frees them afterwards. It executes __asm__ volatile("" :::
> "memory") to thwart some compiler optimization regarding writing to
> to-be-freed memory.

instead of

    crypto_foo(key);
    dont_optimize_me_memset(key, 0, sizeof key);

can you do

    crypto_foo(key);
    memcpy(key, dummykey, sizeof key);
    crypto_foo(key);
    memcpy(key, dummykey, sizeof key);

if there is no sensitive data based conditional in the code
(which there should not be in crypto logic nor in memcpy)
the exact same registers and instructions should be exercised
twice. i.e. you clobber all state in a portable way, no arch
specific magic hack is needed nor new compiler flag.

technically this can still leak information in all sorts of
ways (c is a high level language, internally the implementation
can do whatever with the secrets), but this is pretty much how
far you can go within c (pretending otherwise with random weird
compiler or libc extensions is a mistake imho).

>
> We have a test "dmsecuretest.sh" that loads cryptographic keys into the
> kernel, dumps a core, the core file is analyzed and if it contains the
> key, the test fails.
>
> This test fails on AMD Zen 4 - the reason for the failure is that the
> "memcpy" function uses ZMM registers for data copying. When memcpy exits,
> the encryption key is present in the ZMM registers and the key remains
> there even after both source and destination buffers of memcpy were
> cleared.
>
> When we perform dynamic symbol lookup, the ZMM registers are spilled on
> the stack and they remain there forever - this is the reason why the core
> file contains the encryption key and the test fails.
>
> I'd like to ask what to do with it? We could use LD_BIND_NOW=1 (or
> -Wl,-z,now) - it mostly works, but not entirely - the key may still be
> present on the stack even if we use LD_BIND_NOW=1.
>
> When I hack the file glibc/sysdeps/x86_64/multiarch/ifunc-memmove.h so
> that it always selects the ERMS variant of memcpy, the problem goes away.
>
> Could it be possible to add some switch to glibc, that could be turned on
> by security-sensitive programs and that would prevent glibc from using the
> vector registers? Or, do you suggest another solution?
>
> Mikulas
>
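
The replay idea in Szabolcs Nagy's reply could be spelled out roughly as follows. This is only a sketch under assumptions: `crypto_foo` here is a toy stand-in with no key-dependent branches, and `KEY_LEN`, `dummykey` and `use_key_then_scrub` are hypothetical names. The `volatile` sink is an addition of mine so that the compiler does not delete the second, otherwise unused, call:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { KEY_LEN = 32 };

/* Toy stand-in for the real crypto routine: XORs the key bytes together.
   The control flow does not depend on the key contents. */
static uint8_t crypto_foo(const uint8_t key[KEY_LEN])
{
    uint8_t acc = 0;
    for (size_t i = 0; i < KEY_LEN; i++)
        acc ^= key[i];
    return acc;
}

/* Runs the operation on the real key, then replays the same code path on a
   dummy key so the same registers and stack slots get overwritten. */
static uint8_t use_key_then_scrub(uint8_t key[KEY_LEN])
{
    static const uint8_t dummykey[KEY_LEN] = { 0 };
    volatile uint8_t sink;

    uint8_t result = crypto_foo(key);   /* real work with the secret */
    memcpy(key, dummykey, KEY_LEN);     /* overwrite the secret in memory */
    sink = crypto_foo(key);             /* replay with the dummy key */
    (void)sink;
    memcpy(key, dummykey, KEY_LEN);     /* replay the copy as well */
    return result;
}
```

As the message itself notes, this stays within what C can express and does not guarantee that no copy of the secret survives. A compiler may also inline or specialize the two calls differently, so the result would need checking on the target toolchain.
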