* [dm-crypt] dm-crypt + mdadm
From: Adam Boyhan @ 2013-11-22 16:29 UTC
To: dm-crypt

I am doing a lot of testing with dm-crypt and mdadm and am running into
an issue which is killing me. We run a standard CentOS 6 load with
kernel "2.6.32-358.23.2.el6.x86_64".

Scenario #1: mdadm raid10 -> dm-crypt -> ext4
Scenario #2: dm-crypt -> mdadm raid10 -> ext4

Scenario #1 has no issues and runs reliably but takes a significant hit
in performance due to the single-threaded nature of kcryptd. I read
quite a bit about people instead encrypting each block device and then
putting software RAID on top of that (Scenario #2). This all goes well
until I start to benchmark. The write phase of the benchmark goes well,
but as soon as I hit the read portion of the test, the machine panics
and locks up. I initially tested this idea with raid0 and did not get
this panic; it seems that only raid10 causes it.

At this point I am stumped as to what I can do. I feel this is more an
issue with mdadm than dm-crypt, but I figured it was worth getting
others' opinions.

I appreciate any and all help. If I am missing any details, let me know
and I will provide them.

Thanks,
Adam Boyhan
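For readers comparing the two scenarios, the difference is only the order in which the RAID layer and the dm-crypt layer are stacked. The sketch below is a minimal illustration, not the poster's actual setup: it only builds and prints the command lines and runs nothing. The device names, the array name /dev/md0, and the mapping names (crypt_md0, crypt0, ...) are hypothetical, and all cipher, key-size, and chunk-size options are left at their defaults.

```python
def stack_commands(disks, raid_first):
    """Return the shell commands (as strings) for one of the two layouts.

    Nothing is executed; the strings are only meant for side-by-side reading.
    All device and mapping names are hypothetical placeholders.
    """
    cmds = []
    if raid_first:
        # Scenario #1: RAID10 across the raw disks, one dm-crypt mapping on top.
        cmds.append("mdadm --create /dev/md0 --level=10 --raid-devices=%d %s"
                    % (len(disks), " ".join(disks)))
        cmds.append("cryptsetup luksFormat /dev/md0")
        cmds.append("cryptsetup luksOpen /dev/md0 crypt_md0")
        cmds.append("mkfs.ext4 /dev/mapper/crypt_md0")
    else:
        # Scenario #2: one dm-crypt mapping per disk, RAID10 across the mappings.
        mapped = []
        for i, disk in enumerate(disks):
            name = "crypt%d" % i
            cmds.append("cryptsetup luksFormat %s" % disk)
            cmds.append("cryptsetup luksOpen %s %s" % (disk, name))
            mapped.append("/dev/mapper/" + name)
        cmds.append("mdadm --create /dev/md0 --level=10 --raid-devices=%d %s"
                    % (len(mapped), " ".join(mapped)))
        cmds.append("mkfs.ext4 /dev/md0")
    return cmds


if __name__ == "__main__":
    example_disks = ["/dev/sdb", "/dev/sdc", "/dev/sdd", "/dev/sde"]
    for label, raid_first in (("Scenario #1", True), ("Scenario #2", False)):
        print(label)
        for cmd in stack_commands(example_disks, raid_first):
            print("  " + cmd)
```

In Scenario #2 every member disk gets its own dm-crypt device, which is why that layout is often tried as a way around a one-thread-per-device encryption bottleneck.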
* Re: [dm-crypt] dm-crypt + mdadm
From: Arno Wagner @ 2013-11-22 22:53 UTC
To: dm-crypt

On Fri, Nov 22, 2013 at 17:29:57 CET, Adam Boyhan wrote:
> I am doing a lot of testing with dm-crypt and mdadm and am running
> into an issue which is killing me. We run a standard CentOS 6 load
> with kernel "2.6.32-358.23.2.el6.x86_64".
>
> Scenario #1: mdadm raid10 -> dm-crypt -> ext4
> Scenario #2: dm-crypt -> mdadm raid10 -> ext4
>
> Scenario #1 has no issues and runs reliably but takes a significant
> hit in performance due to the single-threaded nature of kcryptd. I
> read quite a bit about people instead encrypting each block device
> and then putting software RAID on top of that. This all goes well
> until I start to benchmark. The write phase of the benchmark goes
> well, but as soon as I hit the read portion of the test, the machine
> panics and locks up. I initially tested this idea with raid0 and did
> not get this panic; it seems that only raid10 causes it. At this
> point I am stumped as to what I can do. I feel this is more an issue
> with mdadm than dm-crypt, but I figured it was worth getting others'
> opinions. I appreciate any and all help. If I am missing any details,
> let me know and I will provide them.

I have had Scenario #1 running for years with a 3-way RAID1 (notebook
drives give me about 1 firmware crash/year, hence the 3-way...).

I do not have a performance hit. Writing runs at a sustained speed of
~75MB/s with 3 "kworker"s at about 15% CPU each or 1 kworker at 45% (it
varies between the two states). Reading is similar. The CPU is a 4-core
Phenom II 945. That speed is about what the disks can handle with ext3;
non-encrypted gives about the same speed. (I am not using ext4, it is
still too new for me.) The kernel is a self-compiled stock kernel.org
kernel 3.10.17.

I would suggest that 2.6.32 is at best called "historic" and entirely
unsuitable for any kind of performance-critical set-up. I also strongly
suspect (from some evaluation work on CentOS I did recently for a
customer) that Red Hat has messed up updating their own kernel and you
may do better with a self-compiled 2.6.32.61. The real benchmark would
be with something like 3.10.20 or the like, or maybe 3.12.1. (Careful:
In my setup, 3.10.19 has a networking issue, but 3.10.17 has been
running stable for some time, so I would recommend that one.)

My bottom line is that
a) I do not trust the CentOS "stable" kernel one bit.
b) 2.6.32 is so old, doing anything on it to improve
   performance is probably a waste of time. 2.6.32 is 4 years
   old by now and the 2.6.x line is 10 years old...

Arno
--
Arno Wagner, Dr. sc. techn., Dipl. Inform., Email: arno@wagner.name
GnuPG: ID: CB5D9718 FP: 12D6 C03B 1B30 33BB 13CF B774 E35C 5FA1 CB5D 9718
----
There are two ways of constructing a software design: One way is to make
it so simple that there are obviously no deficiencies, and the other way
is to make it so complicated that there are no obvious deficiencies. The
first method is far more difficult. --Tony Hoare

* Re: [dm-crypt] dm-crypt + mdadm
From: Milan Broz @ 2013-11-23 8:10 UTC
To: dm-crypt; Cc: adamb

On 22.11.2013 23:53, Arno Wagner wrote:
> My bottom line is that
> a) I do not trust the CentOS "stable" kernel one bit.
> b) 2.6.32 is so old, doing anything on it to improve
>    performance is probably a waste of time. 2.6.32 is 4 years
>    old by now and the 2.6.x line is 10 years old...

Sorry, but this is far too broad a generalization and a typical mistake
with "enterprise" kernels :)

2.6.32 is the baseline for RHEL6/CentOS 6 kernels. But many patches are
backported there and it gets a lot more testing. (And CentOS should be
just a rebuild of the RHEL kernel.)

Yes, they can mess things up sometimes; that's why there are stable
kernels (no new features) and paid support for RHEL.

In many areas the kernel number means nothing; the kernel contains
features from 3.12 etc. (especially for device-mapper this is almost
always true).

(And my experience is that RHEL/CentOS kernels are pretty stable in
comparison with manual upstream builds. I would not suggest that any
customer build their own kernel for these enterprise distributions -
note you need to update and maintain the userspace bits as well, handle
security updates, and you have to know some internal differences.)

But in the RHEL6 dmcrypt case: yes, there is a single-threaded
implementation (every dmcrypt device has its own single thread). This
is kind of an exception, because usually the RHEL device-mapper stack
follows upstream.

This is no longer true for upstream, where it uses per-CPU threads (the
limitation is that it always processes work on the CPU which submitted
it, so there are still some issues).

Backporting this to RHEL6 is near to impossible (not only for technical
reasons, think certification etc.) but I cannot speak for Red Hat
anymore.
(You can just be sure you are not the first requesting it, and I spent
quite a lot of time playing with it.)

So complaining about the RHEL/CentOS kernel here makes no sense; please
file a bug in bugzilla or use their support channel.

For upstream, please test a current kernel. You should get better
performance (depending on the scenario).

(Mikulas has posted several times another approach to dmcrypt
parallelization, but I am still not convinced it really helps. And
currently there are some changes upstream in the MD and block device
subsystems which influence it as well.)

Anyway, the general suggestion now is to use a fast CPU with AES-NI
switched on. (This can saturate even very fast RAID arrays with a single
thread; IIRC I saw throughput >500MB/s for recent server hw & the RHEL6
kernel.)

Milan
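
Whether a given x86 machine advertises AES-NI at all can be checked from the CPU feature flags the kernel exports. The sketch below is a minimal illustration that assumes a Linux system with /proc/cpuinfo; the function name cpu_has_aes is made up for this example. It only reports whether the CPU advertises the "aes" flag, not whether the accelerated kernel crypto module is actually loaded and in use by dm-crypt.

```python
def cpu_has_aes(cpuinfo_text):
    """Return True if any 'flags' line in /proc/cpuinfo text lists 'aes'."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only look at the feature-flag lines, and match 'aes' as a whole word.
        if sep and key.strip() == "flags" and "aes" in value.split():
            return True
    return False


if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:
        print("CPU advertises AES-NI:", cpu_has_aes(f.read()))
```

If the flag is missing on hardware that should support it, the feature may be disabled in the firmware setup, which is presumably what "switched on" refers to above.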

* Re: [dm-crypt] dm-crypt + mdadm
From: Arno Wagner @ 2013-11-23 11:58 UTC
To: dm-crypt

On Sat, Nov 23, 2013 at 09:10:08 CET, Milan Broz wrote:
> On 22.11.2013 23:53, Arno Wagner wrote:
> > My bottom line is that
> > a) I do not trust the CentOS "stable" kernel one bit.
> > b) 2.6.32 is so old, doing anything on it to improve
> >    performance is probably a waste of time. 2.6.32 is 4 years
> >    old by now and the 2.6.x line is 10 years old...
>
> Sorry, but this is far too broad a generalization and a typical
> mistake with "enterprise" kernels :)

I see you feel strongly about this ;-)
But I don't think so. Maybe I was not clear enough. Certainly I did not
give the details I base that statement on.

> 2.6.32 is the baseline for RHEL6/CentOS 6 kernels. But many patches
> are backported there and it gets a lot more testing.
> (And CentOS should be just a rebuild of the RHEL kernel.)
>
> Yes, they can mess things up sometimes; that's why there are stable
> kernels (no new features) and paid support for RHEL.

I had 3 immediate issues with older, reliable hardware that never gave
me any trouble with any other kernels:

1. Kernel panic due to a missing radeon firmware image.
   Newer kernels do not panic on this. Older ones do not
   even detect the integrated graphics as anything special,
   just as VGA, and hence do not panic either.
   I finally had to switch off the integrated graphics to get
   it to boot at all.

2. A perfectly fine RTL8139 card was not even recognized.

3. A perfectly fine RTL8168 card caused the driver to spam
   the syslog and text console with warnings and made it unusable.
   (No X11 on this installation....)

Sorry, but that is not what I would call a stable kernel. Feels more
like something hacked together...

The only possible explanation I have is that this very standard
hardware is not "enterprise" enough (being cheap), did not make it onto
the hardware compatibility list, and hence does not get tested well or
at all. I did not check that list and usually I do not use RHEL/CentOS.
This was a customer installation that I evaluated, and I was shocked to
run into 3 serious issues with the kernel right away.

> In many areas the kernel number means nothing; the kernel contains
> features from 3.12 etc. (especially for device-mapper this is almost
> always true).
>
> (And my experience is that RHEL/CentOS kernels are pretty stable
> in comparison with manual upstream builds. I would not
> suggest that any customer build their own kernel for these
> enterprise distributions

The suggestion was for tests, not to create a stable installation this
way. If there is still a performance hit with a current kernel, then it
is likely not a kernel issue. If not, then it is, and moving away from
CentOS may be a viable solution...

> - note you need to update and maintain the userspace bits as well,
> handle security updates, and you have to know some internal
> differences.)

True. For example, with newer kernels I had issues because some include
files were moved around and the include path for the Debian stable gcc
is not sufficient anymore. Fixing that was a bit tricky.

> But in the RHEL6 dmcrypt case: yes, there is a single-threaded
> implementation (every dmcrypt device has its own single thread).
> This is kind of an exception, because usually the RHEL device-mapper
> stack follows upstream.

That was the reason I called it "too old". Not in general, not
driver-wise, and not stability-wise, but with a really old kernel some
things cannot reasonably be back-ported, and removing bottlenecks that
require architectural fixes may be impossible. Hence my statement that
it does not make sense to do performance optimization on an old kernel
architecture.

> This is no longer true for upstream, where it uses per-CPU threads
> (the limitation is that it always processes work on the CPU which
> submitted it, so there are still some issues).
>
> Backporting this to RHEL6 is near to impossible (not only for
> technical reasons, think certification etc.) but I cannot speak for
> Red Hat anymore.
> (You can just be sure you are not the first requesting it, and I
> spent quite a lot of time playing with it.)

I did not request this, I know it does not really make sense ;-)

> So complaining about the RHEL/CentOS kernel here makes no sense;
> please file a bug in bugzilla or use their support channel.

Not a complaint, just a comment.

> For upstream, please test a current kernel. You should get better
> performance (depending on the scenario).
>
> (Mikulas has posted several times another approach to dmcrypt
> parallelization, but I am still not convinced it really helps. And
> currently there are some changes upstream in the MD and block device
> subsystems which influence it as well.)
>
> Anyway, the general suggestion now is to use a fast CPU with AES-NI
> switched on. (This can saturate even very fast RAID arrays with a
> single thread; IIRC I saw throughput >500MB/s for recent server hw &
> the RHEL6 kernel.)

Yes, I would think so. And that way you prevent possible
incompatibilities between the kernel and the rest of the distribution.
And if you pay for RHEL, that config gets you support, while a custom
kernel (understandably) gets you a shrug.

Arno

* Re: [dm-crypt] dm-crypt + mdadm
From: Milan Broz @ 2013-11-23 13:29 UTC
To: dm-crypt

On 23.11.2013 12:58, Arno Wagner wrote:
> I had 3 immediate issues with older, reliable hardware that
> never gave me any trouble with any other kernels:
>
> 1. Kernel panic due to a missing radeon firmware image.
>    Newer kernels do not panic on this. Older ones do not
>    even detect the integrated graphics as anything special,
>    just as VGA, and hence do not panic either.
>    I finally had to switch off the integrated graphics to get
>    it to boot at all.

I think we are off topic here, but do not forget I was paid for several
years to work on kernel parts in RHEL :)

Once you install new hw, you also have to install the fw packages and
rebuild the initramfs. For VGA you can quite safely boot in text mode
and fix it later; it is just special know-how about some kernel
options... And yes, it is ****

> 2. A perfectly fine RTL8139 card was not even recognized.

First, this Realtek is really cheap hw, but it works. There are many
revisions and I guess you had a new one which was not yet recognised.
There are also some workarounds, but you just cannot expect that the
installation will detect hw which was not available when the image was
built. (There are new installation images available in RHN for minor
updates.)

> 3. A perfectly fine RTL8168 card caused the driver to spam
>    the syslog and text console with warnings and made it unusable.
>    (No X11 on this installation....)
>
> Sorry, but that is not what I would call a stable kernel. Feels
> more like something hacked together...

Well, this is exactly what support should solve. But usually a short
search on RH bugzilla works as well.
(And yes, I was sometimes desperate about how it works, but that's
another story :)

Enterprise Linux is always a trade-off - you cannot have a stable ABI
for ~10 years and also expect 5-year-old installation images to work on
new motherboards. Usually the next update works, though.

So in general, this kind of distro is not the best option for you. I
think Debian or something similar will work much better.

> The only possible explanation I have is that this very standard
> hardware is not "enterprise" enough (being cheap), did not make it
> onto the hardware compatibility list, and hence does not get tested
> well or at all.

Obviously not everything is tested, but I am sure these are. (I had
many such cheap cards myself - and working :)

>> (And my experience is that RHEL/CentOS kernels are pretty stable
>> in comparison with manual upstream builds. I would not
>> suggest that any customer build their own kernel for these
>> enterprise distributions
>
> The suggestion was for tests, not to create a stable installation
> this way. If there is still a performance hit with a current
> kernel, then it is likely not a kernel issue. If not, then
> it is, and moving away from CentOS may be a viable solution...

Regarding the performance hit - it CAN be userspace as well. For
storage, if you need to handle multipath, lvm2 etc. you know what I am
talking about. (E.g. RAID misalignment because of using an old
userspace tool is very common.)

...

Anyway, sorry for spam :)

Milan
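
The RAID misalignment mentioned above comes down to simple arithmetic: data placed on top of an array is aligned when its starting offset is a whole multiple of the array's full stripe. The sketch below is a minimal illustration under stated assumptions, not a diagnostic tool: the function name is_aligned, the 512-byte sector size, and the sample offsets and chunk sizes are all illustrative, and data_disks stands for the number of chunks that make up one full stripe of the array in question.

```python
SECTOR_BYTES = 512  # assumed logical sector size


def is_aligned(offset_sectors, chunk_kib, data_disks):
    """Return True if a data offset falls on a full-stripe boundary.

    offset_sectors: where the payload starts, in 512-byte sectors
    chunk_kib:      RAID chunk size in KiB
    data_disks:     number of chunks per full stripe (assumption, see text)
    """
    stripe_bytes = chunk_kib * 1024 * data_disks
    return (offset_sectors * SECTOR_BYTES) % stripe_bytes == 0


if __name__ == "__main__":
    # Illustrative values only: a 2 MiB offset versus an odd-sized offset,
    # both on a hypothetical array with 512 KiB chunks and 2 data chunks.
    for offset in (4096, 2056):
        print(offset, "sectors aligned:", is_aligned(offset, 512, 2))
```

An offset that fails this check means a write that the upper layer believes to be stripe-sized can straddle two stripes on the array, which is one way a userspace tool with unsuitable defaults can cost performance independently of the kernel.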