Message-ID: <4E70B46C.9050304@gmail.com>
Date: Wed, 14 Sep 2011 16:04:28 +0200
From: Peter Merhaut
To: dm-crypt@saout.de
Subject: [dm-crypt] slow read performance, but fast writes?

Hi,

I've got a really strange problem with my setup, and I'm not quite sure why.

First of all, the setup: 6x 2TB Samsung HDDs, each one encrypted with dm-crypt, and on top of the encrypted block devices there's a RAID6. Normally you would assume that writes are slower than reads, since dm-crypt has to encrypt 2 additional parity stripes (and md raid6 has to calculate the 2 parity stripes).

Let's take a look at the write performance first.
Here's a 10-second iostat average while running "dd if=/dev/zero of=/daten/testfile bs=1M":

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.12    0.05   94.15    2.17    0.00    3.50

Device:    rrqm/s    wrqm/s      r/s      w/s     rkB/s      wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb          9.30  15200.00     1.80  2972.70     44.40   73472.40    49.43    15.53    5.21  201.72    5.09   0.26  76.68
sdd          6.30  14495.00     1.60  3759.60     31.60   73756.00    39.24     2.91    0.77  133.44    0.71   0.19  73.08
sdf          3.10  14519.30     1.70  3650.00     19.20   73429.20    40.23     2.86    0.78  137.06    0.72   0.20  72.52
sde          6.10  14240.10     1.30  3964.20     29.60   73572.40    37.12     2.80    0.70  173.38    0.65   0.18  71.70
sdg          3.10  14859.40     1.10  3306.50     16.80   73410.00    44.40     2.72    0.82    3.45    0.82   0.19  63.06
sdc          0.20  14276.20     1.00  3909.40      4.80   73450.00    37.57     2.65    0.67   41.80    0.66   0.17  65.29
md3          0.00      0.00     0.00  4609.60      0.00  294870.40   127.94     0.00    0.00    0.00    0.00   0.00   0.00

Works as expected. Since this machine is powered by a quad-core, all the crypto and raid6 threads are nicely balanced over all 4 cores. Sequential write averages out at about 300MB/s. So the CPU is capable of encrypting 6 times 72MB per second (while calculating parity for raid6).

Now for the reads. 10 seconds of iostat while running "dd if=/dev/md3 of=/dev/null bs=1M":

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.15    0.05   37.61   19.90    0.00   42.29

Device:    rrqm/s    wrqm/s      r/s      w/s     rkB/s      wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb       5700.20      0.00  1141.90     0.00  27507.20       0.00    48.18     6.37    5.52    5.52    0.00   0.42  48.27
sdd       5420.60      0.00  1419.80     0.00  27462.40       0.00    38.68     4.47    3.14    3.14    0.00   0.30  42.12
sdf       5646.90      0.00  1198.10     0.00  27418.00       0.00    45.77     5.50    4.57    4.57    0.00   0.36  42.91
sde       5837.10      0.00  1006.40     0.00  27507.20       0.00    54.66     6.69    6.61    6.61    0.00   0.49  49.45
sdg       5671.10      0.00  1171.30     0.00  27401.20       0.00    46.79     4.65    3.95    3.95    0.00   0.34  39.99
sdc       5994.20      0.00   851.30     0.00  27485.60       0.00    64.57     7.11    8.27    8.27    0.00   0.57  48.44
md3          0.00      0.00 40927.60     0.00 163710.40       0.00     8.00     0.00    0.00    0.00    0.00   0.00   0.00

42% idle and almost 20% iowait?
I've already tried everything I could think of: changed the kernel CONFIG_HZ, tried a tickless kernel, enabled/disabled NCQ, changed the BIOS from AHCI to IDE mode, and tuned read-ahead for the hard disks, crypto devices and RAID array (in different combinations).

The same thing happens if I start badblocks on all 6 crypto devices: throughput drops to 29MB/s on each drive. When I overclock the CPU (AMD 910e) by about 20%, the read rate increases by exactly 20%.

Any advice would be greatly appreciated.

Thanks and best regards,
Peter
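
As a rough cross-check of the numbers above, the per-disk rates from the two iostat tables can be added up and compared with what md3 reports. This is only a sketch: the figures are copied from this post, the variable and function names are made up for illustration, and it assumes that iostat's "kB" means 1024 bytes and that every byte moved on the physical disks passed through dm-crypt.

```python
# Per-disk throughput in kB/s, copied from the two iostat tables in this post.
WRITE_KB_S = {"sdb": 73472.40, "sdd": 73756.00, "sdf": 73429.20,
              "sde": 73572.40, "sdg": 73410.00, "sdc": 73450.00}
READ_KB_S = {"sdb": 27507.20, "sdd": 27462.40, "sdf": 27418.00,
             "sde": 27507.20, "sdg": 27401.20, "sdc": 27485.60}
MD3_WRITE_KB_S = 294870.40
MD3_READ_KB_S = 163710.40

KB_PER_MB = 1024.0  # assumption: iostat's "kB" is treated as 1024 bytes


def total_mb_s(per_disk_kb_s):
    """Sum the per-disk rates and convert kB/s to MB/s."""
    return sum(per_disk_kb_s.values()) / KB_PER_MB


def summarize():
    """Compare what dm-crypt had to process during the write and read runs."""
    encrypted = total_mb_s(WRITE_KB_S)   # data plus parity, all encrypted
    decrypted = total_mb_s(READ_KB_S)    # everything read from the disks
    return {
        "encrypted_mb_s": encrypted,
        "decrypted_mb_s": decrypted,
        "md3_write_mb_s": MD3_WRITE_KB_S / KB_PER_MB,
        "md3_read_mb_s": MD3_READ_KB_S / KB_PER_MB,
        # With 6 disks in RAID6, 4 of every 6 chunks carry data.
        "data_fraction_on_write": MD3_WRITE_KB_S / sum(WRITE_KB_S.values()),
        "decrypt_to_encrypt_ratio": decrypted / encrypted,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.2f}")
```

The data fraction on write comes out close to 4/6, which matches a 6-disk RAID6, and the last ratio shows the asymmetry: during reads the crypto layer processes well under half of what it handles during writes.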