linux-btrfs.vger.kernel.org archive mirror
From: Claes Fransson <claes.v.fransson@gmail.com>
To: Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: bad key ordering - repairable?
Date: Sat, 27 Jan 2018 18:42:42 +0100
Message-ID: <CAEY8F1rp9_rCozyRQWrdma1NpqNx+G6QmVfPQL0yaeP76bSi=Q@mail.gmail.com>
In-Reply-To: <CAEY8F1pVrZnf3M6mGJaxogx14ZrJ5CV3++_-y13sTniJ3ds4ww@mail.gmail.com>

2018-01-27 18:32 GMT+01:00 Claes Fransson <claes.v.fransson@gmail.com>:
>
> Duncan Wed, 24 Jan 2018 15:18:25 -0800
>
> Claes Fransson posted on Wed, 24 Jan 2018 20:44:33 +0100 as excerpted:
>
> > So, I have now some results from the PassMark Memtest86! I let the
> > default automatic tests run for about 19 hours and 16 passes. It
> > reported zero "Errors", but 4 lines of "[Note] RAM may be vulnerable to
> > high frequency row hammer bit flips". If I understand it correctly,
> > it means that some errors were detected when the RAM was tested at
> > higher rates than guaranteed accurate by the vendors.
>
> From Wikipedia:
>
>> Row hammer (also written as rowhammer) is an unintended side effect in
>> dynamic random-access memory (DRAM) that causes memory cells to leak
>> their charges and interact electrically between themselves, possibly
>> altering the contents of nearby memory rows that were not addressed in
>> the original memory access. This circumvention of the isolation between
>> DRAM memory cells results from the high cell density in modern DRAM, and
>> can be triggered by specially crafted memory access patterns that rapidly
>> activate the same memory rows numerous times.[1][2][3]
>>
>> The row hammer effect has been used in some privilege escalation computer
>> security exploits.
>>
>> https://en.wikipedia.org/wiki/Row_hammer
>>
>> So it has nothing to do with (generic) testing the RAM at higher rates
>> than guaranteed by the vendors, but rather, with deliberate rapid
>> repeated access (at normal clock rates) of the same cell rows in order
>> to trigger a bitflip in nearby memory cells that could not normally be
>> accessed due to process separation and insufficient privileges.
>
>
Well, I was thinking of the specific message reported by memtest86.
See the PassMark website,
https://www.memtest86.com/troubleshooting.htm, under "Why am I only
getting errors during Test 13 Hammer Test?", second paragraph.
Thanks for the Wikipedia explanation though.
>
>> IOW, it's unlikely to be accidentally tripped, and thus is exceedingly
>> unlikely to be relevant here, unless you're being hacked, of course.
>
>
Okay, thanks for your conclusion.
>
>>
> That said, and entirely unrelated to rowhammer, I know one of the
> problems of memory test false-negatives from experience.
>
> In my case, I was even running ECC RAM.  But the memory I had purchased
> (back in the day when memory was far more expensive and sub-GB memory was
> the norm) was cheap, and as it happened, marked as stable at slightly
> higher clock rates than it actually was.  But I couldn't afford more (or
> I'd have procured less dodgy RAM in the first place) and had little
> recourse but to live with it for awhile.  A year or so later there was a
> BIOS update that added better memory clocking control, and I was able to
> declock the RAM slightly from its rating (IIRC to PC-3000 level, it was
> PC3200 rated, this was DDR1 era), after which it was /entirely/ stable,
> even after reducing some of the wait-state settings somewhat to try to
> claw back some of what I lost due to the underclocking.
>
> I run gentoo, and nearly all of my problems occurred when I was doing
> updates, building packages at 100% CPU with multiple cores accessing the
> same RAM.  FWIW, the most frequent /detected/ problem was bunzip checksum
> errors as it decompressed and verified the data in memory (before writing
> out)... that would move or go away if I tried again.  Occasionally I'd
> get machine-check errors (MCEs), but not frequently, and the ECC RAM
> subsystem /never/ reported errors.
>
My filesystem went read-only just after I updated a lot of
packages (I think it was thousands of packages :) ), so massive
disk I/O for me, but possibly also some CPU and RAM usage...
>
>> But the memory tests gave that memory an all-clear.
>
>
>>> The problem with the memory tests in this case is that they tend to work
>>> on an otherwise unloaded system, and test the retention of the memory
>>> cells, /not/ so much the speed and reliability at which they are accessed
>>> under fully loaded system stress -- and how could they when memory speed
>>> is normally set by the BIOS and not something the memory tester has
>>> access to?
>>>
>>> But my memory problems weren't with the memory cells themselves -- they
>>> retained their data just fine and indeed it was ECC RAM so would have
>>> triggered ECC errors if they didn't -- but with the precision timing of
>>> memory IO -- it wasn't quite up to the specs it claimed to support and
>>> would occasionally produce in-transit errors (the ECC would have detected
>>> and possibly corrected errors in storage), and the memory testers simply
>>> didn't test that like a fully loaded system doing unpacks of sources and
>>> builds from them did.
>>>
>>> As mentioned, once I got a BIOS update that let me declock the RAM a bit,
>>> everything was fine, and it remained fine when I did upgrade the RAM some
>>> years later, after prices had fallen, as well.
>
>
Thanks for telling me, but unfortunately the BIOS settings menus on my
laptop have no option for changing the RAM clocking.

Claes
>
>> (The system was first-gen AMD Opteron, on a server-grade Tyan board, that
>> I ran from purchase in late 2003 for over eight years, maxing out the
>> pair of CPUs to dual-core Opteron 290s and the RAM to 8 gigs, over time,
>> until the board finally died in 2012 due to burst capacitors.  Which
>> reminds me, I'm still running the replacement, a Gigabyte with an fx6100
>> overclocked a bit to 3.9 GHz and 16 gig RAM, and it's now nearing six
>> years old, so I suppose I better start planning for the next upgrade...
>> I've spent that six years upgrading to big-screen TVs as monitors, with a
>> 65inch/165cm 4K as my primary now and a 48inch/122cm as a secondary to
>> put youtube or whatever on fullscreen, and to now my second generation of
>> ssds, a pair of 1 TB samsung evos, but this reminds me that at nearing
>> six years old the main system's aging too, so I better start thinking of
>> replacing it again...)
>>
>> --
>> Duncan - List replies preferred.   No HTML msgs.
>> "Every nonfree program has a lord, a master --
>> and if you use the program, he is your master."  Richard Stallman

