* [dm-crypt] Poor performance (idle cpu)
From: Jakob Sandgren @ 2010-02-08 1:16 UTC
To: dm-crypt
Hi,
I'm using dm-crypt for several mappings with a hardware raid backend.
Using a raw read from the raid device (e.g. sda) gives ~250MB/s
But when I read from an encrypted mapping, I just get ~70MB/s. That
would be fine if I at least had the kcryptd process using a core at
100%, but that is not the case. Three of my four cores are 99% idle and
one core is 50% idle (approx.).
I have recently upgraded my hardware from an older quad-core system
(AMD) to a new Core i7 (860) and expected improved performance. When
I did not get that, I did some more investigation and found the
above. I have also read posts from others having the same
problems, but no explanation.
Does anyone have an explanation for this, or a hint on how one could
"tune" something to get better performance?
Output from "cryptsetup status":
/dev/mapper//dev/mapper/shared1 is active:
cipher: aes-cbc-essiv:sha256
keysize: 128 bits
device: /dev/mapper/areca_0_1-shared1_raw
offset: 1032 sectors
size: 3145726968 sectors
mode: read/write
Best Regards,
Jakob
* Re: [dm-crypt] Poor performance (idle cpu)
From: Arno Wagner @ 2010-02-08 1:50 UTC
To: dm-crypt
On Mon, Feb 08, 2010 at 02:16:54AM +0100, Jakob Sandgren wrote:
> Hi,
>
> I'm using dm-crypt for several mappings with a hardware raid backend.
> Using a raw read from the raid device (e.g sda) gives ~250MB/s
>
> But when I read from an encrypted mapping, I just get ~70MB/s. That
> should be fine if I at least have the kcryptd process using a core at
> 100%, but that is not the case. Three of my four cores is 99% idle and
> one core is 50% idle (aprox.).
Which means your core is too slow to support the full 250MB/s
speed.
> I have recently upgraded my hardware from an older quadcore system
> (AMD) to a new Core I7 (860) and expected improved performance and
> when I did not get that, then did I do some more investegation and
> found out above. I have also read posts from others having the same
> problems, but no explanation.
I suspect that, since the core can sustain only about 140MB/s of
encryption, the accesses get broken up. It is quite possible that
if your array delivered only 120MB/s, it would still reach
that rate encrypted.
> Anyone who has an explanation for this or a hint if one could
> "tune" anything to get better performance?
You may try to encrypt the individual disks and do the
RAID on top of the decrypted disks. That way you should
have more cores participating in the crypto. The problem
with this is of course that it requires the flexibility
of software RAID; hardware RAID cannot do it.
You can also try larger stripe sizes, but that should
make only a minor difference.
You can also try to slow your RAID down to something
like 120MB/s in order to stay within what a single
core can decrypt.
Just a note on multicore speed: It is often better to have
fewer faster cores than more slower ones. Your problem
is an example of that.
Arno
--
Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email: arno@wagner.name
GnuPG: ID: 1E25338F FP: 0C30 5782 9D93 F785 E79C 0296 797F 6B50 1E25 338F
----
Cuddly UI's are the manifestation of wishful thinking. -- Dylan Evans
If it's in the news, don't worry about it. The very definition of
"news" is "something that hardly ever happens." -- Bruce Schneier
* Re: [dm-crypt] Poor performance (idle cpu)
From: Jakob Sandgren @ 2010-02-08 23:54 UTC
To: dm-crypt
Hi,
(please keep me on CC since I'm not subscribed yet)
>> I'm using dm-crypt for several mappings with a hardware raid backend.
>> Using a raw read from the raid device (e.g sda) gives ~250MB/s
>>
>> But when I read from an encrypted mapping, I just get ~70MB/s. That
>> should be fine if I at least have the kcryptd process using a core
>> at
>> 100%, but that is not the case. Three of my four cores is 99% idle
>> and
>> one core is 50% idle (aprox.).
>
>Which means your core is too slow to support the full 250MB/s
>speed.
>
>> I have recently upgraded my hardware from an older quadcore system
>> (AMD) to a new Core I7 (860) and expected improved performance and
>> when I did not get that, then did I do some more investegation and
>> found out above. I have also read posts from others having the same
>> problems, but no explanation.
>
>I suspect as the core can support only about 140MB/s encryption
>speed, the accesses get broken. It is well possible that
>if your array would only give 120MB/s it would still have
>that rate encrypted.
This does not make sense to me. I cannot understand how a "too fast"
disk could give worse results. Disk requests would get issued at the
speed that decryption can handle(?). I do not understand what a
"broken access" would be.
Anyway, just to test the theory, I set up a single disk that gives
120MB/s sustained read from the unencrypted mapping, but when I
read from the encrypted mapping I still ended up with the low 70MB/s
and a lot of idle CPU.
Running two reads at the same time (from the same encrypted mapping)
actually increased the combined read rate by ~10% ?!
To me it seems like there is some serious flaw within kcryptd that
makes it wait for "something" instead of sending enough requests to
the disks to make sure it has data to decrypt. What do you think?
Best Regards,
Jakob Sandgren
* Re: [dm-crypt] Poor performance (idle cpu)
From: Arno Wagner @ 2010-02-09 0:28 UTC
To: dm-crypt; +Cc: Jakob Sandgren
On Tue, Feb 09, 2010 at 12:54:16AM +0100, Jakob Sandgren wrote:
> Hi,
>
> (please keep me on CC since I'm not subscribed yet)
>
> >> I'm using dm-crypt for several mappings with a hardware raid backend.
> >> Using a raw read from the raid device (e.g sda) gives ~250MB/s
> >>
> >> But when I read from an encrypted mapping, I just get ~70MB/s. That
> >> should be fine if I at least have the kcryptd process using a core
> >> at
> >> 100%, but that is not the case. Three of my four cores is 99% idle
> >> and
> >> one core is 50% idle (aprox.).
> >
> >Which means your core is too slow to support the full 250MB/s
> >speed.
> >
> >> I have recently upgraded my hardware from an older quadcore system
> >> (AMD) to a new Core I7 (860) and expected improved performance and
> >> when I did not get that, then did I do some more investegation and
> >> found out above. I have also read posts from others having the same
> >> problems, but no explanation.
> >
> >I suspect as the core can support only about 140MB/s encryption
> >speed, the accesses get broken. It is well possible that
> >if your array would only give 120MB/s it would still have
> >that rate encrypted.
>
>
> This does not make sense to me, I can not understand how a "to fast"
> disk could give worse results? Disk requests would get issued at the
> speed that decryption can handle(?). I do not understand what a
> "broken access" would be.
Ok, if you are not too fast, then data can be read uninterrupted
with maximum-size accesses. If you read too fast, then the
disk accesses have to be made smaller and wait for the
decryption to finish. That adds waiting times and data is
no longer read at full speed. It is not so bad with HDDs;
the worst case is probably tapes: If you process slower
than the tape streams, it has to stop and rewind frequently,
killing performance.
> Anyway, just to try the theory did I set up a single disk that would
> give 120MB sustained read from the unencrypted mapping, but when I
> read from the encrypted mapping I still ended up with the low 70MB/s
> and a lot of idle cpu.
Hmm.
> Running two reads at the same time (to the same encrypted mapping)
> actually increased the combined read rate with ~10% ?!
>
> To me it seems like there is some serious flaw within kcryptd that
> ends up to wait for "something" instead of sending enough requests to
> the disks to make sure it has data to decrypt. What do you think?
The same thing.
Here is a reference test (I have notebook disks in this server):
Raw read: 54MB/s 14% CPU
Read with decrypt: 53MB/s 65% CPU
Another idea: Are these 50% CPU on the faked Intel cores, i.e.
50% of a hyperthreading core? This could actually mean 100%
on a proper core. You get twice as many hyperthreading pseudo
cores, but they are not full cores and often cannot perform
at 100%, as some infrastructure is shared between the two halves
of each core. So if one half runs at 100% and the other half at 0%
(as the other half needs something available only once at full load),
the complete core load could be reported as 50% when in fact
it is 100%.
That would mean the crypto is pretty slow on your new CPU.
As a reference, my 53MB/s at 65% CPU is on a 2800MHz Athlon
64 X2 5600+ with aes-cbc-plain.
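One way to check the hyperthreading idea, as a minimal sketch: on Linux, sysfs records which logical CPUs share a physical core in the file `thread_siblings_list`. The helper name `ht_siblings` and its optional second argument (an alternative directory root) are made up for illustration.

```shell
#!/bin/sh
# ht_siblings CPU [SYSFS_CPU_ROOT]
# Print the logical CPUs that share a physical core with logical CPU number CPU.
# SYSFS_CPU_ROOT defaults to the usual Linux location; the parameter is only
# an illustrative hook so the function can be pointed at another directory.
ht_siblings() {
    cpu=$1
    root=${2:-/sys/devices/system/cpu}
    file="$root/cpu$cpu/topology/thread_siblings_list"
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo "no topology information for cpu$cpu" >&2
        return 1
    fi
}

# Example: list the sibling set of every logical CPU the kernel exposes.
# for d in /sys/devices/system/cpu/cpu[0-9]*; do
#     n=${d##*cpu}
#     printf 'cpu%s: ' "$n"; ht_siblings "$n"
# done
```

A line such as `0,4` would mean that logical CPUs 0 and 4 are two threads of the same physical core, which is the situation described above.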
Here is an OpenSSL crypto speed test:
openssl speed -evp aes-256-cbc
[...]
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc      71848.00k    98649.49k   110187.78k   113646.25k   114666.15k
You might want to compare this with the numbers on your CPU.
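To set these figures next to disk throughput it helps to convert them to MB/s. A small sketch, relying only on the statement in the output that the numbers are in 1000s of bytes per second; the helper name `openssl_k_to_mbs` is hypothetical.

```shell
#!/bin/sh
# openssl_k_to_mbs VALUE
# Convert one figure from the "openssl speed" table (thousands of bytes per
# second, with or without the trailing "k") to MB/s (10^6 bytes per second).
openssl_k_to_mbs() {
    value=${1%k}    # strip a trailing "k" if present
    # LC_ALL=C keeps "." as the decimal separator regardless of locale.
    LC_ALL=C awk -v v="$value" 'BEGIN { printf "%.1f\n", v / 1000 }'
}
```

For example, `openssl_k_to_mbs 114666.15k` prints `114.7`, so the 8192-byte column in the table corresponds to roughly 115 MB/s on one core in this OpenSSL test.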
Arno
* Re: [dm-crypt] Poor performance (idle cpu) [SOLVED; problem with "pv"]
From: Jakob Sandgren @ 2010-02-09 14:48 UTC
To: dm-crypt
On Tue, Feb 09, 2010 at 01:28:06AM +0100, Arno Wagner wrote:
> On Tue, Feb 09, 2010 at 12:54:16AM +0100, Jakob Sandgren wrote:
> > Hi,
> >
> > (please keep me on CC since I'm not subscribed yet)
> >
> > To me it seems like there is some serious flaw within kcryptd that
> > ends up to wait for "something" instead of sending enough requests to
> > the disks to make sure it has data to decrypt. What do you think?
>
> The same thing.
>
> Here is a reference test (I have notebook disks in this server):
>
> Raw read: 54MB/s 14% CPU
> Read with decrypt: 53MB/s 65% CPU
For reference, this is the exact output of my benchmark; maybe there
is some difference in setup or benchmark?
OOOOOooops! While putting together this information I actually found
the cause of the problem: it was my benchmark that was wrong!
This was the benchmark I used to measure the performance:
dd if=/dev/mapper/bench1 bs=4M iflag=direct | pv | dd of=/dev/null
The number reported by "pv" during the run was ~75MB/s, and dd
reported the same number when finished.
Changing this to:
dd if=/dev/mapper/bench1 bs=4M iflag=direct of=/dev/null count=1000
gave a more correct number: 125MB/s
I was not aware that piping the data through pv would cause such a
big degradation in performance.
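A rough sketch for checking how much a pipeline costs on a given machine, in the spirit of the two commands above. One detail that may matter (an assumption, not something verified in this thread): the second `dd` in the piped variant runs with its default block size of 512 bytes, so the data is drained from the pipe in very small pieces. The helper names `read_direct` and `read_piped` are made up for illustration.

```shell
#!/bin/sh
# read_direct SRC [extra dd operands...]
# Read SRC straight into /dev/null with 4M blocks; no pipe involved.
read_direct() {
    src=$1; shift
    dd if="$src" of=/dev/null bs=4M "$@"
}

# read_piped SRC SINK_BS [extra dd operands for the reader...]
# Same read, but pushed through a pipe into a second dd that drains it
# with block size SINK_BS (dd's default would be 512 bytes).
read_piped() {
    src=$1; sink_bs=$2; shift 2
    dd if="$src" bs=4M "$@" 2>/dev/null | dd of=/dev/null bs="$sink_bs"
}

# Example (device name taken from the message above; adjust to your setup):
# read_direct /dev/mapper/bench1 iflag=direct count=1000
# read_piped  /dev/mapper/bench1 512 iflag=direct count=1000
# read_piped  /dev/mapper/bench1 4M  iflag=direct count=1000
```

Each call ends with dd's own summary on stderr, so the three rates can be compared side by side without putting an extra tool into the data path.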
> That would mean the crypto is pretty slow on your new CPU.
> As a reference, my 53MB/s at 65% CPU is on an 2800MHz Athlon
> 64 X2 5600+ with aes-cbc-plain.
>
> Here is an OpenSSL crypto speed test:
> openssl speed -evp aes-256-cbc
> [...]
> The 'numbers' are in 1000s of bytes per second processed.
> type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
> aes-256-cbc 71848.00k 98649.49k 110187.78k 113646.25k 114666.15k
>
> You might want to compare this with the numbers on your CPU.
The numbers from my system (Core i7) are below:
root@mvh:~# openssl speed -evp aes-256-cbc
...
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc     110769.28k   118629.67k   120600.15k   121138.86k   121206.10k
I have now been able to get 175MB/s from my main RAID partition.
Best Regards,
Jakob
* Re: [dm-crypt] Poor performance (idle cpu) [SOLVED; problem with "pv"]
From: Arno Wagner @ 2010-02-09 16:14 UTC
To: dm-crypt
On Tue, Feb 09, 2010 at 03:48:29PM +0100, Jakob Sandgren wrote:
> On Tue, Feb 09, 2010 at 01:28:06AM +0100, Arno Wagner wrote:
> > On Tue, Feb 09, 2010 at 12:54:16AM +0100, Jakob Sandgren wrote:
> > > Hi,
> > >
> > > (please keep me on CC since I'm not subscribed yet)
> > >
> > > To me it seems like there is some serious flaw within kcryptd that
> > > ends up to wait for "something" instead of sending enough requests to
> > > the disks to make sure it has data to decrypt. What do you think?
> >
> > The same thing.
> >
> > Here is a reference test (I have notebook disks in this server):
> >
> > Raw read: 54MB/s 14% CPU
> > Read with decrypt: 53MB/s 65% CPU
>
> For reference this is the exact output of my benchmark, maybe there
> are some difference in setup or benchmark?
>
> OOOOOooops! While putting toghether this information I actually found
> the cause of the problem, it was my benchark that was wrong!
>
> This was the benchmark I used to get the performance was:
> dd if=/dev/mapper/bench1 bs=4M iflag=direct |pv | dd of=/dev/null
> and the number reported by "pv" during the run was ~75MB/s and dd
> reported the same number when finished.
>
> Changing this to:
> dd if=/dev/mapper/bench1 bs=4M iflag=direct of=/dev/null count=1000
> gave a more correct number; 125MB/s
>
>
> I was not aware of that piping the data through pv would cause such a
> big degradation in performance.
>
Ah, yes. I think pv is a rather old tool and uses some things
that are slow.
I have my own tool (wcs for "wc-stream") that is a bit
faster and provides real-time speeds on a vt100. It is
really small. Sources are attached.
> > That would mean the crypto is pretty slow on your new CPU.
> > As a reference, my 53MB/s at 65% CPU is on an 2800MHz Athlon
> > 64 X2 5600+ with aes-cbc-plain.
> >
> > Here is an OpenSSL crypto speed test:
> > openssl speed -evp aes-256-cbc
> > [...]
> > The 'numbers' are in 1000s of bytes per second processed.
> > type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
> > aes-256-cbc 71848.00k 98649.49k 110187.78k 113646.25k 114666.15k
> >
> > You might want to compare this with the numbers on your CPU.
>
> The numbers from my system (Core I7) are below
>
> root@mvh:~# openssl speed -evp aes-256-cbc
> ...
> The 'numbers' are in 1000s of bytes per second processed.
> type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
> aes-256-cbc 110769.28k 118629.67k 120600.15k 121138.86k 121206.10k
>
> I has now been able to get a 175MB/sec from my main raid partition.
Good.
Arno
[-- Attachment #2: wcs.c --]
[-- Type: text/x-csrc, Size: 2396 bytes --]
/* "wc-follow": wc with incremental output every second or so.
 * Line positioning is done like in dd_rescue with
 * "optimistic positioning", i.e. the hope that we have a terminal that
 * understands vt100 positioning.
 *
 * (C) Arno Wagner <arno@wagner.name> 2008. Distributed under
 * The GNU public license v2
 *
 * Version 1.0
 *
 * Compile with "gcc -O6 -o wcs wcs.c"
 */

#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <time.h>    /* time(), time_t */
#include <unistd.h>  /* read(), ssize_t */

const char* up    = "\x1b[A";  /* vt100: cursor up one line */
const char* down  = "\n";
const char* right = "\x1b[C";  /* vt100: cursor right (currently unused) */

const char * usage() {
  return("\n"
         " Prints out running count and rate statistics about the data\n"
         " read on stdin to stderr. Does use cursor positioning, which\n"
         " should work on most terminals (vt100 or later).\n"
         "\n"
         " stdin is copied through to stdout, much like in tee.\n"
         " \n"
         " This program does not support any commandline arguments.\n"
         "\n"
         "\n");
}

/* Print a byte count to stderr, scaled to kB/MB/GB/TB (powers of 1000). */
void printlong(long long l) {
  if (l < 1000000L)
    fprintf(stderr, "%7.3f kB", l/1000.0);
  else if (l < 1000000000)
    fprintf(stderr, "%7.3f MB", l/1000000.0);
  else if (l < 1000000000000LL)
    fprintf(stderr, "%7.3f GB", l/1000000000.0);
  else
    fprintf(stderr, "%7.3f TB", l/1000000000000.0);
}

int main(int argc, char ** argv) {
  char buf[4096];
  long long cnt = 0;
  time_t to, tn, ts, dt;
  ssize_t read_count;
  double rate;

  (void)argv;  // no arguments are supported
  if (argc > 1) {
    fprintf(stderr, "%s", usage());
    exit(1);
  }

  to = time(NULL);
  ts = to;
  fputs(down, stderr);
  while (1) {
    read_count = read(0, buf, sizeof(buf));
    if (read_count < 0 && errno == EINTR) continue; // not an error
    if (read_count < 0) {
      perror("Abnormal condition in read from stdin");
      exit(1);
    }
    if (read_count > 0) {
      cnt += read_count;
      if (fwrite(buf, 1, (size_t)read_count, stdout) != (size_t)read_count) {
        perror("Error in wcs writing to stdout");
        exit(1);
      }
    }
    tn = time(NULL);
    if (tn != to || read_count == 0) {  // once per second, and at EOF
      to = tn;
      fprintf(stderr, "%s read: ", up);
      printlong(cnt);
      fprintf(stderr, " [ %12lld B] avg: ", cnt);
      // Calculate the rate; report 0 while less than one second has elapsed
      dt = tn - ts;
      rate = (dt > 0) ? cnt / (double)dt : 0.0;
      printlong((long long) rate);
      fprintf(stderr, "/sec [ %5ld sec]%s", (long)dt, down);
    }
    if (read_count == 0) break; // we are done
  }
  return 0;
}
* Re: [dm-crypt] Poor performance (idle cpu) [SOLVED; problem with "pv"]
From: M Thomas Frederiksen @ 2010-02-09 16:16 UTC
To: Jakob Sandgren; +Cc: dm-crypt
Hi Folks,
I have four disks. I installed Kubuntu 9.10 alt, using /dev/sda, /dev/sdb,
and /dev/sdc. I had long used /dev/sdd1 (single partition for the whole
disk) as LUKS. I kept all my backups on that disk, and didn't touch it at
all during the installation. The installation program said that /dev/sdd1
was LVM, which I thought was odd; however, I didn't use or set up LVM at all
on any of the disks (hence I didn't go into the LVM setup part of the install
program).
Post install when I run cryptsetup:
thomas@Aristotle:~$ sudo cryptsetup luksOpen /dev/sdd1 backup_crypt
Enter LUKS passphrase:
Command failed: /dev/sdd1 is not a LUKS partition
thomas@Aristotle:~$ sudo cryptsetup luksDump /dev/sdd1
Command failed: /dev/sdd1 is not a LUKS partition
thomas@Aristotle:~$ shoot me now
shoot: command not found
There are two things I can think of:
1. The install program wrote out a new partition table, changing the
partition type flag.
2. The /dev/sdXs were named differently, and I've scribbled all over what
was /dev/sdd1.
In the first case I think I have hope; in the second... If the first is
true, what steps should I take?
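One read-only check that fits both cases is to look for the LUKS magic bytes at the start of each candidate partition, in case the backup disk now shows up under a different name. A minimal sketch, assuming a LUKS header that begins with the six bytes `LUKS` 0xBA 0xBE; the helper name `has_luks_magic` is hypothetical, and it only reads from the device.

```shell
#!/bin/sh
# has_luks_magic DEVICE_OR_FILE
# Succeeds if the first six bytes are the LUKS magic: "LUKS" 0xBA 0xBE.
# Read-only: nothing is written to the device.
has_luks_magic() {
    magic=$(od -A n -t x1 -N 6 "$1" | tr -d ' \n')
    [ "$magic" = "4c554b53babe" ]
}

# Example (needs read access to the devices, e.g. run as root):
# for p in /dev/sd?1; do
#     if has_luks_magic "$p"; then echo "$p: LUKS magic found"; fi
# done
```

`cryptsetup isLuks <device>` performs a comparable test; the point of the loop is only to see whether the header is still present somewhere before anything else is written to the disks.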
--
Cheers,
~Thomas
Samuel Goldwyn<http://www.brainyquote.com/quotes/authors/s/samuel_goldwyn.html>
- "I'm willing to admit that I may not always be right, but I am never
wrong."