* Does /dev/urandom now block until initialised ?
From: Ken Moffat @ 2018-07-23  3:43 UTC
To: Theodore Y. Ts'o, linux-crypto, lkml

Ted,

last week you proposed an RFC patch to gather entropy from the CPU's
hwrng, and I was pleased - until I discovered one of my stalling
desktop machines does not have a hwrng. At that point I thought that
the problem was only from reading /dev/random, so I went away to look
at persuading the immediate consumer (unbound) to use /dev/urandom.

Did that, no change. Ran strace from the bootscript, confirmed that
only /dev/urandom was being used, and that it seemed to be blocking.
Thought maybe this was the only problematic bootscript, tried moving
it to later, but hit the same problem on chronyd (again, seems to use
urandom). And yes, I probably should have started chronyd first
anyway, but that's irrelevant to this problem.

BUT: I'm not sure if I've correctly understood what is happening.
It seems to me that the fix for CVE-2018-1108 (4.17-rc1, 4.16.4)
means /dev/urandom will now block until fully initialised.

Is that correct and intentional ?

If so, to get the affected desktop machines to boot I seem to have
some choices:

1. Wait for two and a half minutes (timed on the kaveri, the haswell
   seemed to take a similar time).

2. Sit at the keyboard and start thumping it once userspace has
   started.

3. For the haswell, apply your patch and trust that the CPU has not
   been backdoored.

4. Run haveged.

The latter certainly lets it boot in a reasonable time, but people
who understand this seem to regard it as untrustworthy. For users
of /dev/urandom that is no big deal, but does it not mean that the
values from /dev/random will be similarly untrustworthy and
therefore I should not use this machine for generating long-lived
secure keys ?

TIA.

ĸen
* Re: Does /dev/urandom now block until initialised ?
From: Theodore Y. Ts'o @ 2018-07-23 15:16 UTC
To: Ken Moffat
Cc: linux-crypto, lkml

On Mon, Jul 23, 2018 at 04:43:01AM +0100, Ken Moffat wrote:
> Ted,
>
> last week you proposed an rfc patch to gather entropy from the CPU's
> hwrng, and I was pleased - until I discovered one of my stalling
> desktop machines does not have a hwrng. At that point I thought that
> the problem was only from reading /dev/random, so I went away to look
> at persuading the immediate consumer (unbound) to use /dev/urandom.
>
> Did that, no change. Ran strace from the bootscript, confirmed that
> only /dev/urandom was being used, and that it seemed to be blocking.
> Thought maybe this was the only problematic bootscript, tried moving
> it to later, but hit the same problem on chronyd (again, seems to use
> urandom). And yes, I probably should have started chronyd first
> anyway, but that's irrelevant to this problem.

Nope, /dev/urandom still doesn't block. Are you sure it isn't caused
by something calling getrandom(2) --- which *will* block?

We intentionally left /dev/urandom non-blocking, because of backwards
compatibility.

> BUT: I'm not sure if I've correctly understood what is happening.
> It seems to me that the fix for CVE-2018-1108 (4.17-rc1, 4.16.4)
> means /dev/urandom will now block until fully initialised.
>
> Is that correct and intentional ?

No, that's not right. What the fix does is account more accurately
for the entropy collected before getrandom(2) becomes non-blocking.
There were a bunch of things we were doing wrong, including assuming
that 100% of the bytes being sent via add_device_entropy() were
random --- when some of what was feeding into it was the (fixed)
information you would get from running dmidecode (e.g., the fixed
results from the BIOS configuration data).

Some of those bytes might not be known to an external adversary (such
as your CPU mainboard's serial number), but they're not exactly
*Secret*.

> If so, to get the affected desktop machines to boot I seem to have
> some choices...

Well, this probably isn't going to be popular, but the other thing
that might help is you could switch distros. I'm guessing you run a
Red Hat distro, probably Fedora, right?

The problem which most people are seeing turns out to be a terrible
interaction between dracut-fips, systemd and a Red Hat specific patch
to libgcrypt for FIPS/FEDRAMP compliance:

https://src.fedoraproject.org/rpms/libgcrypt/blob/master/f/libgcrypt-1.6.2-fips-ctor.patch#_23

Uninstalling dracut-fips and recreating the initramfs might also help.

One of the reasons why I didn't see the problem when I was developing
the remediation patch for CVE-2018-1108 is that I run Debian testing,
which doesn't have this particular Red Hat patch.

> The latter certainly lets it boot in a reasonable time, but people
> who understand this seem to regard it as untrustworthy. For users
> of /dev/urandom that is no big deal, but does it not mean that the
> values from /dev/random will be similarly untrustworthy and
> therefore I should not use this machine for generating long-lived
> secure keys ?

This really depends on how paranoid / careful you are. Remember, your
keyboard controller was almost certainly built in Shenzhen, China, and
Matt Blaze published a paper on the Jitterbug in 2006:

http://www.crypto.com/papers/jbug-Usenix06-final.pdf

In practice, after 30 minutes of operation, especially if you are
using the keyboard, the entropy pool *will* be sufficiently
randomized, whether or not it was sufficiently randomized at boot. The
real danger of CVE-2018-1108 was always long-term keys generated at
first boot. That was the problem discussed in "Mining your p's and
q's: Detection of Widespread Weak Keys in Network Devices" (see
https://factorable.net).

So generating long-lived keys means (a) you need to be sure you trust
all of the software on the system --- some very paranoid people such
as Bruce Schneier used a freshly installed machine from CD-ROM that
was never attached to the network before examining materials from
Edward Snowden, and (b) you need to make sure the entropy pool is
initialized.

Remember we are constantly feeding input from the hardware sources
into the entropy pool; it doesn't stop the moment we think the entropy
pool is initialized. And you can always mix extra "stuff" into the
entropy pool by taking, say, a series of dice rolls and sending the
results via the "cat" or "echo" command into /dev/urandom.

So it should be possible to use the machine for generating long-lived
keys; you might just need to be a bit more careful before you do it.
It's really keys generated automatically at boot that are most at risk
--- and you can always regenerate the host SSH keys after a fresh
install. In fact, what I have done in the past when I first log in to
a freshly created Cloud VM system is to run a command like "dd
if=/dev/urandom count=1 bs=256 | od -x" on my own machine, then log in
to the VM, run "cat > /dev/urandom", and cut and paste the od -x
output into the guest VM, to better initialize the entropy pool on the
VM before regenerating the host SSH keys.
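For anyone who wants to script the two points above (getrandom(2) blocks until the pool is initialised while /dev/urandom does not, and extra data can be mixed in by writing to /dev/urandom), here is a minimal Python sketch. It assumes Linux and Python 3.6 or later; the helper names crng_ready and mix_into_pool are made up for illustration and are not part of any existing tool.

```python
import os


def crng_ready() -> bool:
    """Report whether getrandom(2) would return without blocking."""
    try:
        # GRND_NONBLOCK turns "block until initialised" into EAGAIN,
        # which Python surfaces as BlockingIOError.
        os.getrandom(1, os.GRND_NONBLOCK)
        return True
    except BlockingIOError:
        return False


def mix_into_pool(data: bytes, path: str = "/dev/urandom") -> int:
    """Write data to the random device and return the byte count.

    The kernel mixes written bytes into the pool but does not credit
    any entropy for them, so this is a supplement, not a shortcut.
    """
    with open(path, "wb", buffering=0) as dev:
        return dev.write(data)


if __name__ == "__main__":
    # Only reports state; nothing is written unless mix_into_pool is called.
    print("getrandom(2) would block:", not crng_ready())
```

Calling mix_into_pool(b"3 6 1 4 4 2") with the default path is roughly the scripted equivalent of the "cat > /dev/urandom" step, and a boot script that must not stall could poll crng_ready() rather than call getrandom(2) directly.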
Cheers,

					- Ted
* Re: Does /dev/urandom now block until initialised ?
From: Jeffrey Walton @ 2018-07-23 16:11 UTC
To: Theodore Y. Ts'o, Ken Moffat, Linux Crypto Mailing List, lkml

On Mon, Jul 23, 2018 at 11:16 AM, Theodore Y. Ts'o <tytso@mit.edu> wrote:
> On Mon, Jul 23, 2018 at 04:43:01AM +0100, Ken Moffat wrote:
>> ...
> One of the reasons why I didn't see the problem when I was developing
> the remediation patch for CVE-2018-1108 is because I run Debian
> testing, which doesn't have this particular Red Hat patch.

Off-topic, I'm kind of surprised it took that long to fix it (if I am
parsing things correctly).

I believe Stephan Mueller wrote up the weakness a couple of years ago.
He's the one who explained the interactions to me. Mueller was even
cited at https://github.com/systemd/systemd/issues/4167.

It is too bad Mueller did not receive credit for it in the CVE
database.

Jeff
* Re: Does /dev/urandom now block until initialised ?
From: Theodore Y. Ts'o @ 2018-07-23 19:11 UTC
To: Jeffrey Walton
Cc: Ken Moffat, Linux Crypto Mailing List, lkml

On Mon, Jul 23, 2018 at 12:11:12PM -0400, Jeffrey Walton wrote:
>
> I believe Stephan Mueller wrote up the weakness a couple of years ago.
> He's the one who explained the interactions to me. Mueller was even
> cited at https://github.com/systemd/systemd/issues/4167.

Stephan had a lot of complaints about the existing random driver.
That's because he has a replacement driver that he has been pushing,
and instead of giving explicit complaints with specific patches to fix
those specific issues, he gave a generalized blast of complaints, plus
a "big bang rewrite".

I've reviewed his lrng doc, and this specific issue was not among his
complaints. Quite a while ago, I had gone through his document, and
had specifically addressed each of his complaints. As far as I have
been able to determine, all of the specific technical complaints (as
opposed to personal preference issues) have been addressed.

His complaint is a textbook example of how *not* to file a bug report.
That being said, we try to take bug reports from as many sources as
possible even if they aren't well formed or submitted in the ideal
place. (I'm reminded of Linux's networking scalability limitations
which Microsoft filed in the Wall Street Journal 15+ years ago --- and
which only applied if you had 4 CPUs and four 10 megabit networking
cards; if you had four CPUs and a 100 megabit networking card, Linux
would grind Microsoft into the dust; still it was a bug, and we
appreciated the report and we fixed it, even if it wasn't filed in the
ideal forum. :-)

> It is too bad Mueller did not receive credit for it in the CVE
> database.

As near as I can tell, he doesn't deserve it for this particular
issue. It's all Jann Horn and Google's Project Zero. (And Jann's
writeup is a textbook example of how to report this sort of issue with
great specificity and analysis.)

					- Ted
* Re: Does /dev/urandom now block until initialised ?
From: Ken Moffat @ 2018-07-23 16:52 UTC
To: Theodore Y. Ts'o, Ken Moffat, linux-crypto, lkml

On 23 July 2018 at 16:16, Theodore Y. Ts'o <tytso@mit.edu> wrote:
> On Mon, Jul 23, 2018 at 04:43:01AM +0100, Ken Moffat wrote:
>>
>> Did that, no change. Ran strace from the bootscript, confirmed that
>> only /dev/urandom was being used, and that it seemed to be blocking.
>
> Nope, /dev/urandom still doesn't block. Are you sure it isn't caused
> by something calling getrandom(2) --- which *will* block?

I'm not at all sure, which was why I asked.

>
> We intentionally left /dev/urandom non-blocking, because of backwards
> compatibility.
>
>> BUT: I'm not sure if I've correctly understood what is happening.
>> It seems to me that the fix for CVE-2018-1108 (4.17-rc1, 4.16.4)
>> means /dev/urandom will now block until fully initialised.
>>
>> Is that correct and intentional ?
>
> No, that's not right. What the fix does is account more accurately
> for the entropy collected before getrandom(2) becomes non-blocking.
> There were a bunch of things we were doing wrong, including assuming
> that 100% of the bytes being sent via add_device_entropy() were
> random --- when some of what was feeding into it was the (fixed)
> information you would get from running dmidecode (e.g., the fixed
> results from the BIOS configuration data).
>
> Some of those bytes might not be known to an external adversary (such
> as your CPU mainboard's serial number), but they're not exactly
> *Secret*.
>
>> If so, to get the affected desktop machines to boot I seem to have
>> some choices...
>
> Well, this probably isn't going to be popular, but the other thing
> that might help is you could switch distros. I'm guessing you run a
> Red Hat distro, probably Fedora, right?
>

Wrong, linuxfromscratch (sysv version) and beyond linuxfromscratch
plus extras such as chronyd. The only initrd is on the haswell, and
just for intel microcode.

> The problem which most people are seeing turns out to be a terrible
> interaction between dracut-fips, systemd and a Red Hat specific patch
> to libgcrypt for FIPS/FEDRAMP compliance:
>
> https://src.fedoraproject.org/rpms/libgcrypt/blob/master/f/libgcrypt-1.6.2-fips-ctor.patch#_23
>
> Uninstalling dracut-fips and recreating the initramfs might also help.
>
> One of the reasons why I didn't see the problem when I was developing
> the remediation patch for CVE-2018-1108 is that I run Debian testing,
> which doesn't have this particular Red Hat patch.
>
>> The latter certainly lets it boot in a reasonable time, but people
>> who understand this seem to regard it as untrustworthy. For users
>> of /dev/urandom that is no big deal, but does it not mean that the
>> values from /dev/random will be similarly untrustworthy and
>> therefore I should not use this machine for generating long-lived
>> secure keys ?
>
> This really depends on how paranoid / careful you are. Remember, your
> keyboard controller was almost certainly built in Shenzhen, China, and
> Matt Blaze published a paper on the Jitterbug in 2006:
>
> http://www.crypto.com/papers/jbug-Usenix06-final.pdf
>
> In practice, after 30 minutes of operation, especially if you are
> using the keyboard, the entropy pool *will* be sufficiently
> randomized, whether or not it was sufficiently randomized at boot. The
> real danger of CVE-2018-1108 was always long-term keys generated at
> first boot. That was the problem discussed in "Mining your p's and
> q's: Detection of Widespread Weak Keys in Network Devices" (see
> https://factorable.net).
>
> So generating long-lived keys means (a) you need to be sure you trust
> all of the software on the system --- some very paranoid people such
> as Bruce Schneier used a freshly installed machine from CD-ROM that
> was never attached to the network before examining materials from
> Edward Snowden, and (b) you need to make sure the entropy pool is
> initialized.
>
> Remember we are constantly feeding input from the hardware sources
> into the entropy pool; it doesn't stop the moment we think the entropy
> pool is initialized. And you can always mix extra "stuff" into the
> entropy pool by taking, say, a series of dice rolls and sending the
> results via the "cat" or "echo" command into /dev/urandom.
>
> So it should be possible to use the machine for generating long-lived
> keys; you might just need to be a bit more careful before you do it.
> It's really keys generated automatically at boot that are most at risk
> --- and you can always regenerate the host SSH keys after a fresh
> install. In fact, what I have done in the past when I first log in to
> a freshly created Cloud VM system is to run a command like "dd
> if=/dev/urandom count=1 bs=256 | od -x" on my own machine, then log in
> to the VM, run "cat > /dev/urandom", and cut and paste the od -x
> output into the guest VM, to better initialize the entropy pool on the
> VM before regenerating the host SSH keys.
>
> Cheers,
>
> - Ted

Thanks. In that case I'll go with the simple fix (haveged).

ĸen