* [dm-crypt] Poor performance (idle cpu)
From: Jakob Sandgren @ 2010-02-08 1:16 UTC
To: dm-crypt

Hi,

I'm using dm-crypt for several mappings with a hardware raid backend.
Using a raw read from the raid device (e.g. sda) gives ~250MB/s

But when I read from an encrypted mapping, I just get ~70MB/s. That
would be fine if I at least had the kcryptd process using a core at
100%, but that is not the case. Three of my four cores are 99% idle and
one core is 50% idle (approx.).

I have recently upgraded my hardware from an older quad-core system
(AMD) to a new Core i7 (860) and expected improved performance. When
I did not get that, I did some more investigation and found the
above. I have also read posts from others having the same
problems, but no explanation.

Does anyone have an explanation for this, or a hint on whether one
could "tune" anything to get better performance?

Output from "cryptsetup status":

/dev/mapper//dev/mapper/shared1 is active:
  cipher:  aes-cbc-essiv:sha256
  keysize: 128 bits
  device:  /dev/mapper/areca_0_1-shared1_raw
  offset:  1032 sectors
  size:    3145726968 sectors
  mode:    read/write

Best Regards,
Jakob

* Re: [dm-crypt] Poor performance (idle cpu)
From: Arno Wagner @ 2010-02-08 1:50 UTC
To: dm-crypt

On Mon, Feb 08, 2010 at 02:16:54AM +0100, Jakob Sandgren wrote:
> Hi,
>
> I'm using dm-crypt for several mappings with a hardware raid backend.
> Using a raw read from the raid device (e.g. sda) gives ~250MB/s
>
> But when I read from an encrypted mapping, I just get ~70MB/s. That
> would be fine if I at least had the kcryptd process using a core at
> 100%, but that is not the case. Three of my four cores are 99% idle and
> one core is 50% idle (approx.).

Which means your core is too slow to support the full 250MB/s speed.

> I have recently upgraded my hardware from an older quad-core system
> (AMD) to a new Core i7 (860) and expected improved performance. When
> I did not get that, I did some more investigation and found the
> above. I have also read posts from others having the same
> problems, but no explanation.

I suspect that, as the core can support only about 140MB/s encryption
speed, the accesses get broken. It is quite possible that if your
array only gave 120MB/s, it would still reach that rate encrypted.

> Does anyone have an explanation for this, or a hint on whether one
> could "tune" anything to get better performance?

You may try to encrypt the individual disks and do the RAID on top of
the decrypted disks. That way you should have more cores participate in
the crypto. The problem with this is of course that it requires the
flexibility of software RAID; hardware RAID cannot do it.

You can also try larger stripe sizes, but that should make only a minor
difference.

You can also try slowing your RAID down to something like 120MB/s in
order to stay within what a single core can decrypt.

Just a note on multicore speed: It is often better to have fewer faster
cores than more slower ones. Your problem is an example of that.

Arno

> Output from "cryptsetup status":
>
> /dev/mapper//dev/mapper/shared1 is active:
>   cipher:  aes-cbc-essiv:sha256
>   keysize: 128 bits
>   device:  /dev/mapper/areca_0_1-shared1_raw
>   offset:  1032 sectors
>   size:    3145726968 sectors
>   mode:    read/write
>
> Best Regards,
> Jakob

--
Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email: arno@wagner.name
GnuPG: ID: 1E25338F FP: 0C30 5782 9D93 F785 E79C 0296 797F 6B50 1E25 338F
----
Cuddly UI's are the manifestation of wishful thinking. -- Dylan Evans

If it's in the news, don't worry about it. The very definition of
"news" is "something that hardly ever happens." -- Bruce Schneier

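Arno's suggestion of encrypting each disk separately and building the
RAID on top of the decrypted mappings could look roughly like the dry
run below. This is only a sketch: it prints the commands instead of
running them, the device names, mapping names, /dev/md0 and RAID level 5
are all assumptions (the thread does not say which level is in use), and
plan_crypt_raid is a made-up helper name. The cipher and key size are
copied from the "cryptsetup status" output in the original post.

```shell
#!/bin/sh
# Dry run only: prints the commands, never executes them.
# Running the printed commands for real would destroy data on the disks.
# plan_crypt_raid, cryptN and /dev/md0 are hypothetical names.
plan_crypt_raid() {
    n=0
    members=""
    for disk in "$@"; do
        name="crypt$n"
        # One dm-crypt mapping per physical disk.
        echo "cryptsetup luksFormat --cipher aes-cbc-essiv:sha256 --key-size 128 $disk"
        echo "cryptsetup luksOpen $disk $name"
        members="$members /dev/mapper/$name"
        n=$((n + 1))
    done
    # Software RAID assembled from the decrypted mappings (level is an assumption).
    echo "mdadm --create /dev/md0 --level=5 --raid-devices=$n$members"
}

plan_crypt_raid /dev/sdb /dev/sdc /dev/sdd
```

With this layout each member disk gets its own dm-crypt mapping, which
is what would let more than one core take part in the crypto.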
* Re: [dm-crypt] Poor performance (idle cpu)
From: Jakob Sandgren @ 2010-02-08 23:54 UTC
To: dm-crypt

Hi,

(please keep me on CC since I'm not subscribed yet)

>> I'm using dm-crypt for several mappings with a hardware raid backend.
>> Using a raw read from the raid device (e.g. sda) gives ~250MB/s
>>
>> But when I read from an encrypted mapping, I just get ~70MB/s. That
>> would be fine if I at least had the kcryptd process using a core at
>> 100%, but that is not the case. Three of my four cores are 99% idle
>> and one core is 50% idle (approx.).
>
>Which means your core is too slow to support the full 250MB/s
>speed.
>
>> I have recently upgraded my hardware from an older quad-core system
>> (AMD) to a new Core i7 (860) and expected improved performance. When
>> I did not get that, I did some more investigation and found the
>> above. I have also read posts from others having the same
>> problems, but no explanation.
>
>I suspect that, as the core can support only about 140MB/s encryption
>speed, the accesses get broken. It is quite possible that if your
>array only gave 120MB/s, it would still reach that rate encrypted.

This does not make sense to me; I cannot understand how a "too fast"
disk could give worse results. Disk requests would get issued at the
speed that decryption can handle(?). I do not understand what a
"broken access" would be.

Anyway, just to test the theory, I set up a single disk that gives
120MB/s sustained read from the unencrypted mapping, but when I read
from the encrypted mapping I still ended up with the low 70MB/s and a
lot of idle cpu.

Running two reads at the same time (from the same encrypted mapping)
actually increased the combined read rate by ~10%?!

To me it seems like there is some serious flaw within kcryptd that
makes it wait for "something" instead of sending enough requests to
the disks to make sure it has data to decrypt. What do you think?

Best Regards,
Jakob Sandgren

* Re: [dm-crypt] Poor performance (idle cpu)
From: Arno Wagner @ 2010-02-09 0:28 UTC
To: dm-crypt; +Cc: Jakob Sandgren

On Tue, Feb 09, 2010 at 12:54:16AM +0100, Jakob Sandgren wrote:
> Hi,
>
> (please keep me on CC since I'm not subscribed yet)
>
> >> I'm using dm-crypt for several mappings with a hardware raid backend.
> >> Using a raw read from the raid device (e.g. sda) gives ~250MB/s
> >>
> >> But when I read from an encrypted mapping, I just get ~70MB/s. That
> >> would be fine if I at least had the kcryptd process using a core at
> >> 100%, but that is not the case. Three of my four cores are 99% idle
> >> and one core is 50% idle (approx.).
> >
> >Which means your core is too slow to support the full 250MB/s
> >speed.
> >
> >> I have recently upgraded my hardware from an older quad-core system
> >> (AMD) to a new Core i7 (860) and expected improved performance. When
> >> I did not get that, I did some more investigation and found the
> >> above. I have also read posts from others having the same
> >> problems, but no explanation.
> >
> >I suspect that, as the core can support only about 140MB/s encryption
> >speed, the accesses get broken. It is quite possible that if your
> >array only gave 120MB/s, it would still reach that rate encrypted.
>
> This does not make sense to me; I cannot understand how a "too fast"
> disk could give worse results. Disk requests would get issued at the
> speed that decryption can handle(?). I do not understand what a
> "broken access" would be.

Ok, if you are not too fast, then data can be read uninterrupted
with maximum size accesses. If you read too fast, then the disk
accesses have to be made smaller and wait for the decryption
to finish. That adds waiting times and data is not read at full speed
anymore. It is not so bad with HDDs; possibly the worst case is
tapes: if you process slower than the tape streams, it has to stop
and rewind frequently, killing performance.

> Anyway, just to test the theory, I set up a single disk that gives
> 120MB/s sustained read from the unencrypted mapping, but when I read
> from the encrypted mapping I still ended up with the low 70MB/s and a
> lot of idle cpu.

Hmm.

> Running two reads at the same time (from the same encrypted mapping)
> actually increased the combined read rate by ~10%?!
>
> To me it seems like there is some serious flaw within kcryptd that
> makes it wait for "something" instead of sending enough requests to
> the disks to make sure it has data to decrypt. What do you think?

The same thing.

Here is a reference test (I have notebook disks in this server):

  Raw read:          54MB/s  14% CPU
  Read with decrypt: 53MB/s  65% CPU

Another idea: Are these 50% CPU on the faked Intel cores, i.e.
50% of a hyperthreading core? This could actually mean 100%
on a proper core. You get twice as many hyperthreading pseudo
cores, but they are not full cores and often cannot perform at
100%, as some infrastructure is shared between the two halves.
So if one half runs at 100% and the other half at 0% (as the other
half needs something available only once at full load),
the complete core load could be reported as 50% when in fact
it is 100%.

That would mean the crypto is pretty slow on your new CPU.
As a reference, my 53MB/s at 65% CPU is on a 2800MHz Athlon
64 X2 5600+ with aes-cbc-plain.

Here is an OpenSSL crypto speed test:

  openssl speed -evp aes-256-cbc
  [...]
  The 'numbers' are in 1000s of bytes per second processed.
  type          16 bytes    64 bytes    256 bytes   1024 bytes  8192 bytes
  aes-256-cbc   71848.00k   98649.49k   110187.78k  113646.25k  114666.15k

You might want to compare this with the numbers on your CPU.

Arno

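Arno's hyperthreading question can be checked on a Linux x86 machine by
comparing the "siblings" (logical CPUs per package) and "cpu cores"
(physical cores per package) fields in /proc/cpuinfo. A minimal sketch,
assuming both fields are present; ht_enabled is a made-up helper name,
not an existing tool:

```shell
#!/bin/sh
# Reads /proc/cpuinfo-style text on stdin and prints "yes" when a package
# exposes more logical CPUs (siblings) than physical cores, otherwise "no".
# ht_enabled is a hypothetical helper name.
ht_enabled() {
    awk -F: '
        /^siblings/  { siblings = $2 + 0 }
        /^cpu cores/ { cores = $2 + 0 }
        END { if (cores > 0 && siblings > cores) print "yes"; else print "no" }
    '
}

# On a live Linux system:
if [ -r /proc/cpuinfo ]; then
    ht_enabled < /proc/cpuinfo
fi
```

If this prints "yes", a per-CPU utilisation figure of 50% is harder to
interpret, for the reason Arno gives: two logical CPUs share one
physical core.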
* Re: [dm-crypt] Poor performance (idle cpu) [SOLVED; problem with "pv"]
From: Jakob Sandgren @ 2010-02-09 14:48 UTC
To: dm-crypt

On Tue, Feb 09, 2010 at 01:28:06AM +0100, Arno Wagner wrote:
> On Tue, Feb 09, 2010 at 12:54:16AM +0100, Jakob Sandgren wrote:
> > Hi,
> >
> > (please keep me on CC since I'm not subscribed yet)
> >
> > To me it seems like there is some serious flaw within kcryptd that
> > makes it wait for "something" instead of sending enough requests to
> > the disks to make sure it has data to decrypt. What do you think?
>
> The same thing.
>
> Here is a reference test (I have notebook disks in this server):
>
>   Raw read:          54MB/s  14% CPU
>   Read with decrypt: 53MB/s  65% CPU

For reference, this is the exact output of my benchmark; maybe there
is some difference in setup or benchmark?

OOOOOooops! While putting together this information I actually found
the cause of the problem: it was my benchmark that was wrong!

This was the benchmark I used to measure the performance:

  dd if=/dev/mapper/bench1 bs=4M iflag=direct |pv | dd of=/dev/null

The number reported by "pv" during the run was ~75MB/s, and dd
reported the same number when finished.

Changing this to:

  dd if=/dev/mapper/bench1 bs=4M iflag=direct of=/dev/null count=1000

gave a more accurate number: 125MB/s

I was not aware that piping the data through pv would cause such a
big degradation in performance.

> That would mean the crypto is pretty slow on your new CPU.
> As a reference, my 53MB/s at 65% CPU is on a 2800MHz Athlon
> 64 X2 5600+ with aes-cbc-plain.
>
> Here is an OpenSSL crypto speed test:
>
>   openssl speed -evp aes-256-cbc
>   [...]
>   The 'numbers' are in 1000s of bytes per second processed.
>   type          16 bytes    64 bytes    256 bytes   1024 bytes  8192 bytes
>   aes-256-cbc   71848.00k   98649.49k   110187.78k  113646.25k  114666.15k
>
> You might want to compare this with the numbers on your CPU.

The numbers from my system (Core i7) are below:

  root@mvh:~# openssl speed -evp aes-256-cbc
  ...
  The 'numbers' are in 1000s of bytes per second processed.
  type          16 bytes    64 bytes    256 bytes   1024 bytes  8192 bytes
  aes-256-cbc   110769.28k  118629.67k  120600.15k  121138.86k  121206.10k

I have now been able to get 175MB/s from my main raid partition.

Best Regards,
Jakob

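The "openssl speed" figures are given in thousands of bytes per second,
so they need dividing by 1000 before they can be set next to the MB/s
numbers from dd. A small sketch of that arithmetic for the two
8192-byte results quoted in this thread; to_mb_per_s is a made-up
helper name. Note that these are userspace aes-256-cbc numbers, while
the mapping uses aes-cbc-essiv:sha256 with a 128-bit key inside the
kernel, so the comparison is only a rough guide.

```shell
#!/bin/sh
# Convert an "openssl speed" figure (thousands of bytes per second, with a
# trailing "k") into MB/s. to_mb_per_s is a hypothetical helper name.
to_mb_per_s() {
    # LC_ALL=C keeps "." as the decimal separator regardless of locale.
    printf '%s\n' "$1" | LC_ALL=C awk '{ sub(/k$/, "", $1); printf "%.1f\n", $1 / 1000 }'
}

to_mb_per_s 114666.15k   # Athlon 64 X2 5600+, 8192-byte blocks -> 114.7
to_mb_per_s 121206.10k   # Core i7 860, 8192-byte blocks        -> 121.2
```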
* Re: [dm-crypt] Poor performance (idle cpu) [SOLVED; problem with "pv"]
From: Arno Wagner @ 2010-02-09 16:14 UTC
To: dm-crypt

On Tue, Feb 09, 2010 at 03:48:29PM +0100, Jakob Sandgren wrote:
> On Tue, Feb 09, 2010 at 01:28:06AM +0100, Arno Wagner wrote:
> > On Tue, Feb 09, 2010 at 12:54:16AM +0100, Jakob Sandgren wrote:
> > > Hi,
> > >
> > > (please keep me on CC since I'm not subscribed yet)
> > >
> > > To me it seems like there is some serious flaw within kcryptd that
> > > makes it wait for "something" instead of sending enough requests to
> > > the disks to make sure it has data to decrypt. What do you think?
> >
> > The same thing.
> >
> > Here is a reference test (I have notebook disks in this server):
> >
> >   Raw read:          54MB/s  14% CPU
> >   Read with decrypt: 53MB/s  65% CPU
>
> For reference, this is the exact output of my benchmark; maybe there
> is some difference in setup or benchmark?
>
> OOOOOooops! While putting together this information I actually found
> the cause of the problem: it was my benchmark that was wrong!
>
> This was the benchmark I used to measure the performance:
>
>   dd if=/dev/mapper/bench1 bs=4M iflag=direct |pv | dd of=/dev/null
>
> The number reported by "pv" during the run was ~75MB/s, and dd
> reported the same number when finished.
>
> Changing this to:
>
>   dd if=/dev/mapper/bench1 bs=4M iflag=direct of=/dev/null count=1000
>
> gave a more accurate number: 125MB/s
>
> I was not aware that piping the data through pv would cause such a
> big degradation in performance.

Ah, yes. I think it is historic and uses some things that
are slow. I have my own tool (wcs for "wc-stream") that is a
bit faster and provides real-time speeds on a vt100. It is
really small. Sources are attached.

> > That would mean the crypto is pretty slow on your new CPU.
> > As a reference, my 53MB/s at 65% CPU is on a 2800MHz Athlon
> > 64 X2 5600+ with aes-cbc-plain.
> >
> > Here is an OpenSSL crypto speed test:
> >
> >   openssl speed -evp aes-256-cbc
> >   [...]
> >   The 'numbers' are in 1000s of bytes per second processed.
> >   type          16 bytes    64 bytes    256 bytes   1024 bytes  8192 bytes
> >   aes-256-cbc   71848.00k   98649.49k   110187.78k  113646.25k  114666.15k
> >
> > You might want to compare this with the numbers on your CPU.
>
> The numbers from my system (Core i7) are below:
>
>   root@mvh:~# openssl speed -evp aes-256-cbc
>   ...
>   The 'numbers' are in 1000s of bytes per second processed.
>   type          16 bytes    64 bytes    256 bytes   1024 bytes  8192 bytes
>   aes-256-cbc   110769.28k  118629.67k  120600.15k  121138.86k  121206.10k
>
> I have now been able to get 175MB/s from my main raid partition.

Good.

Arno

[-- Attachment #2: wcs.c --]

/* "wc-follow": wc with incremental output every second or so.
 * Line positioning is done like in dd_rescue with
 * "optimistic positioning" i.e. the hope that we have a terminal that
 * understands vt100 positioning.
 *
 * (C) Arno Wagner <arno@wagner.name> 2008. Distributed under
 * The GNU public license v2
 *
 * Version 1.0
 *
 * Compile with "gcc -O6 -o wcs wcs.c"
 */

#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <time.h>     /* time(), time_t */
#include <unistd.h>   /* read(), ssize_t */

const char* up    = "\x1b[A"; //]
const char* down  = "\n";
const char* right = "\x1b[C"; //]

const char * usage(void) {
  return("\n"
  " Prints out running count and rate statistics about the data\n"
  " read on stdin to stderr. Does use cursor positioning, which\n"
  " should work on most terminals (vt100 or later).\n"
  "\n"
  " stdin is copied through to stdout, much like in tee.\n"
  " \n"
  " This program does not support any commandline arguments.\n"
  "\n"
  "\n");
}

/* Print a byte count to stderr with a decimal (power-of-1000) unit. */
void printlong(long long l) {
  if      (l < 1000000L)        fprintf(stderr, "%7.3f kB", l/1000.0);
  else if (l < 1000000000)      fprintf(stderr, "%7.3f MB", l/1000000.0);
  else if (l < 1000000000000LL) fprintf(stderr, "%7.3f GB", l/1000000000.0);
  else                          fprintf(stderr, "%7.3f TB", l/1000000000000.0);
}

int main(int argc, char ** argv) {
  char buf[4096];
  long long cnt = 0;
  time_t to, tn, ts, dt;
  ssize_t read_count;
  double rate;

  (void)argv;
  if (argc > 1) {
    fprintf(stderr, "%s", usage());
    exit(1);
  }

  to = time(NULL);
  ts = to;
  fputs(down, stderr);
  while (1) {
    read_count = read(0, buf, sizeof(buf));
    if (read_count < 0 && errno == EINTR) continue; // not an error
    if (read_count < 0) {
      perror("Abnormal condition in read from stdin");
      exit(1);
    }
    if (read_count > 0) {
      cnt += read_count;
      if (fwrite(buf, 1, (size_t)read_count, stdout) != (size_t)read_count) {
        perror("Error in wcs writing to stdout");
        exit(1);
      }
    }
    tn = time(NULL);
    // Update the status line once per second and once more at end of input
    if (tn != to || read_count == 0) {
      to = tn;
      fprintf(stderr, "%s read: ", up);
      printlong(cnt);
      fprintf(stderr, " [ %12lld B]  avg: ", cnt);
      // Calculate the rate; report 0 while less than one second has elapsed
      dt = tn - ts;
      rate = (dt > 0) ? cnt / (double)dt : 0.0;
      printlong((long long) rate);
      fprintf(stderr, "/sec [ %5ld sec]%s", (long)dt, down);
    }
    if (read_count == 0) break; // we are done
  }
  return 0;
}

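The effect Jakob ran into, a measurement pipeline that is itself the
bottleneck, can be tried out without a dm-crypt device. The sketch below
uses a 64 MiB scratch file as a stand-in for /dev/mapper/bench1 and
leaves out iflag=direct, since not every filesystem supports it. It
assumes GNU dd (for the "bs=4M" syntax and the summary line on stderr).
One thing the thread does not discuss, and which is offered here only as
a possible contributor, is that the trailing "dd of=/dev/null" in the
original pipeline runs with dd's default 512-byte block size; the
second and third measurements differ only in that setting.

```shell
#!/bin/sh
# Sketch: compare a direct dd read with the same read pushed through a pipe.
# The scratch file is a hypothetical stand-in for an encrypted mapping, so
# the absolute numbers say nothing about dm-crypt itself.
set -eu
scratch=$(mktemp)
trap 'rm -f "$scratch"' EXIT

# 64 MiB of zeroes as test data.
dd if=/dev/zero of="$scratch" bs=1M count=64 2>/dev/null

echo "direct read, 4M blocks:"
dd if="$scratch" of=/dev/null bs=4M 2>&1 | tail -n 1

echo "piped, consumer using dd's default 512-byte blocks:"
dd if="$scratch" bs=4M 2>/dev/null | dd of=/dev/null 2>&1 | tail -n 1

echo "piped, consumer using 4M blocks:"
dd if="$scratch" bs=4M 2>/dev/null | dd of=/dev/null bs=4M 2>&1 | tail -n 1
```

Each measurement prints dd's own summary line, so the three rates can
be compared directly. A tool such as pv or wcs would sit between the two
dd processes in the piped variants.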
* Re: [dm-crypt] Poor performance (idle cpu) [SOLVED; problem with "pv"]
From: M Thomas Frederiksen @ 2010-02-09 16:16 UTC
To: Jakob Sandgren; +Cc: dm-crypt

Hi Folks,

I have four disks. I installed kubuntu 9.10 alt, using /dev/sda,
/dev/sdb, and /dev/sdc. I had long used /dev/sdd1 (a single partition
for the whole disk) as luks. I kept all my backups on that disk, and
didn't touch it at all during the installation. The installation program
said that /dev/sdd1 was LVM, which I thought was odd; however, I didn't
use or set up LVM at all on any of the disks (hence I didn't go into the
LVM setup part of the install program).

Post install, when I run cryptsetup:

  thomas@Aristotle:~$ sudo cryptsetup luksOpen /dev/sdd1 backup_crypt
  Enter LUKS passphrase:
  Command failed: /dev/sdd1 is not a LUKS partition
  thomas@Aristotle:~$ sudo cryptsetup luksDump /dev/sdd1
  Command failed: /dev/sdd1 is not a LUKS partition
  thomas@Aristotle:~$ shoot me now
  shoot: command not found

There are two things I can think of:

1. The install program wrote out a new partition table, changing the
   partition type flag.
2. The /dev/sdXs were named differently, and I've scribbled all over what
   was /dev/sdd1.

In the first case I think I have hope, in the second...

If the first is true, what steps should I take?

--
Cheers,
~Thomas

Samuel Goldwyn<http://www.brainyquote.com/quotes/authors/s/samuel_goldwyn.html>
- "I'm willing to admit that I may not always be right, but I am
never wrong."

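No reply to this question appears in the thread. One read-only way to
start telling Thomas's two cases apart is to look at the first bytes of
the partition: a LUKS header begins with the six magic bytes "LUKS"
followed by 0xBA 0xBE. The sketch below only reads data and
demonstrates the check on a scratch file; has_luks_magic is a made-up
helper name, and a missing magic does not by itself say what overwrote
it.

```shell
#!/bin/sh
# Read-only check: does a device or image start with the LUKS magic bytes
# "LUKS" 0xBA 0xBE? has_luks_magic is a hypothetical helper name.
has_luks_magic() {
    magic=$(dd if="$1" bs=6 count=1 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "4c554b53babe" ]
}

# Demonstration on a scratch file instead of a real partition.
img=$(mktemp)
printf 'LUKS\272\276' > "$img"
if has_luks_magic "$img"; then
    echo "LUKS magic present"
else
    echo "no LUKS magic at offset 0"
fi
rm -f "$img"
```

On a real system the function would be pointed at the partition, for
example /dev/sdd1, which normally requires root.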