* [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD
@ 2012-07-22 19:07 Marc MERLIN
From: Marc MERLIN @ 2012-07-22 19:07 UTC (permalink / raw)
To: dm-crypt
I got a new Samsung 830 512GB SSD which is supposed to be very high
performance.
The raw device seems fast enough on a quick hdparm test:
/dev/sda4:
Timing cached reads: 14258 MB in 2.00 seconds = 7136.70 MB/sec
Timing buffered disk reads: 1392 MB in 3.00 seconds = 463.45 MB/sec <<<<
which is 4x faster than my non encrypted spinning disk, as expected.
But once I encrypt it, it drops to 5 times slower than my 1TB spinning
disk in the same laptop:
gandalfthegreat:~# hdparm -tT /dev/mapper/ssdcrypt
/dev/mapper/ssdcrypt:
Timing cached reads: 15412 MB in 2.00 seconds = 7715.37 MB/sec
Timing buffered disk reads: 70 MB in 3.06 seconds = 22.91 MB/sec <<<<
gandalfthegreat:~# hdparm -tT /dev/mapper/cryptroot (spinning disk)
/dev/mapper/cryptroot:
Timing cached reads: 16222 MB in 2.00 seconds = 8121.03 MB/sec
Timing buffered disk reads: 308 MB in 3.01 seconds = 102.24 MB/sec <<<<
I used aes-xts-plain as recommended on
http://www.mayrhofer.eu.org/ssd-linux-benchmark
gandalfthegreat:~# cryptsetup status /dev/mapper/ssdcrypt
/dev/mapper/ssdcrypt is active.
type: LUKS1
cipher: aes-xts-plain
keysize: 256 bits
device: /dev/sda4
offset: 4096 sectors
size: 926308752 sectors
mode: read/write
I tried
cryptsetup luksFormat --align-payload=8192
the first time, so my offset was 8K, but that did not make a
difference in speed.
gandalfthegreat:~# lsmod |grep -e aes
aesni_intel 50443 66
cryptd 14517 18 ghash_clmulni_intel,aesni_intel
aes_x86_64 16796 1 aesni_intel
Kernel: 3.4.4-amd64
gandalfthegreat:~# cryptsetup --version
cryptsetup 1.4.3
I know that SSDs are weird and all, but dropping from a raw device speed of
463MB/s to a mere 23MB/s once encrypted, compared to 102MB/s for a similarly
encrypted spinning drive, is a problem, is it not?
Any suggestions would be appreciated.
Thanks,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD
From: Yves-Alexis Perez @ 2012-07-22 19:47 UTC (permalink / raw)
To: Marc MERLIN; +Cc: dm-crypt
On Sun, 2012-07-22 at 12:07 -0700, Marc MERLIN wrote:
> I know that SSDs are weird and all, but getting a raw device speed of a
> mere 23MB/sec down from 463MB/s and compared to 102MB/s for a similarly
> spinning drive, is a problem, is it not?
>
> Any suggestions would be appreciated.
I'm using Debian sid (so still at 3.2 kernel), currently using a 256G
Samsung SSD. What I get is:
root@scapa:~# hdparm -t /dev/sda /dev/mapper/scapa_crypt
/dev/sda:
Timing buffered disk reads: 690 MB in 3.01 seconds = 229.47 MB/sec
/dev/mapper/scapa_crypt:
Timing buffered disk reads: 590 MB in 3.00 seconds = 196.43 MB/sec
root@scapa:~# cryptsetup status /dev/mapper/scapa_crypt
/dev/mapper/scapa_crypt is active and is in use.
type: LUKS1
cipher: aes-xts-plain
keysize: 256 bits
device: /dev/sda2
offset: 4096 sectors
size: 499587760 sectors
mode: read/write
flags: discards
Regards,
--
Yves-Alexis
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD
From: Marc MERLIN @ 2012-07-22 20:39 UTC (permalink / raw)
To: Yves-Alexis Perez, dm-crypt, htd
On Sun, Jul 22, 2012 at 09:47:32PM +0200, Yves-Alexis Perez wrote:
> > Any suggestions would be appreciated.
>
> I'm using Debian sid (so still at 3.2 kernel), currently using a 256G
> Samsung SSD. What I get is:
>
> root@scapa:~# hdparm -t /dev/sda /dev/mapper/scapa_crypt
>
> /dev/sda:
> Timing buffered disk reads: 690 MB in 3.01 seconds = 229.47 MB/sec
> /dev/mapper/scapa_crypt:
> Timing buffered disk reads: 590 MB in 3.00 seconds = 196.43 MB/sec
Right, that's more what I would expect.
On Sun, Jul 22, 2012 at 10:22:13PM +0200, Heinz Diehl wrote:
> I don't know why reading speed is that slow in your case, especially
> as you are using AES-NI, which should give you the highest speed
> available. Maybe others here on the list have a suggestion. Probably,
> you should provide some more information.
I'm happy to if you tell me what else I can give.
Note how my test showed that I had the same aes-xts-plain on the hard drive
in the same laptop, and that one runs at 100MB/s, which is pretty much its
native speed.
Just to make sure that I'm not CPU limited, I tried
gandalfthegreat:~# dd if=/dev/sda4 of=/dev/null bs=1M count=1024
1073741824 bytes (1.1 GB) copied, 2.23959 s, 479 MB/s
gandalfthegreat:~# dd if=/dev/mapper/ssdcrypt of=/dev/null bs=1M count=1024
1073741824 bytes (1.1 GB) copied, 44.3302 s, 24.2 MB/s
atop shows dd isn't really pegging a single core:
THR  SYSCPU  USRCPU  RDDSK   WRDSK  ST  EXC  S  CPUNR  CPU  CMD
1    0.60s   0.01s   226.2M  0K     --  -    D  3      6%   dd
> Otherwise, on the newer Intel i3/i5/i7, twofish-3way is faster than
> AES. You could try to re-format your drive with twofish-xts-plain64
> and adding twofish_common, twofish_x86_64 and twofish_x86_64_3way to
> your initram (as long as your kernel is built with these enabled).
Thanks for the hint. Since it's a lenovo T530 with a recent CPU, I tried it:
gandalfthegreat:~# cryptsetup -c twofish-xts-plain64 -s 256 luksFormat /dev/sda4
Result is the same:
gandalfthegreat:~# hdparm -tT /dev/mapper/ssdcrypt
/dev/mapper/ssdcrypt:
Timing cached reads: 16614 MB in 2.00 seconds = 8317.50 MB/sec
Timing buffered disk reads: 68 MB in 3.08 seconds = 22.09 MB/sec
Grumble.
It shouldn't be a hardware problem since I do see 400MB/s before encryption.
I'm a bit lost here. I'll try other kernels just in case, but that shouldn't
make a difference.
Thanks for the answers,
Marc
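As a sanity check on the dd figures in the message above, the reported rates can be recomputed from the byte counts and elapsed times. This is only a minimal sketch: it assumes dd reports decimal megabytes (1 MB = 10^6 bytes), and the helper name `throughput_mb_s` is made up for illustration.

```python
def throughput_mb_s(num_bytes, seconds):
    """Return throughput in decimal MB/s (assumed to match dd's unit)."""
    return num_bytes / seconds / 1e6


# Figures copied from the two dd runs quoted in the thread.
raw = throughput_mb_s(1073741824, 2.23959)   # dd on /dev/sda4
enc = throughput_mb_s(1073741824, 44.3302)   # dd on /dev/mapper/ssdcrypt

print(f"raw: {raw:.0f} MB/s, encrypted: {enc:.1f} MB/s, "
      f"slowdown: {raw / enc:.1f}x")
```

The same helper can be pointed at the later filesystem-level dd runs to compare them on equal terms.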
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD
From: Arno Wagner @ 2012-07-22 21:47 UTC (permalink / raw)
To: dm-crypt
On Sun, Jul 22, 2012 at 01:39:29PM -0700, Marc MERLIN wrote:
> On Sun, Jul 22, 2012 at 09:47:32PM +0200, Yves-Alexis Perez wrote:
> > > Any suggestions would be appreciated.
> >
> > I'm using Debian sid (so still at 3.2 kernel), currently using a 256G
> > Samsung SSD. What I get is:
SID? That would be "unstable", with possible assorted problems.
[...]
> gandalfthegreat:~# dd if=/dev/mapper/ssdcrypt of=/dev/null bs=1M count=1024
> 1073741824 bytes (1.1 GB) copied, 44.3302 s, 24.2 MB/s
>
> atop shows dd isn't really pegging a single core:
> THR  SYSCPU  USRCPU  RDDSK   WRDSK  ST  EXC  S  CPUNR  CPU  CMD
> 1    0.60s   0.01s   226.2M  0K     --  -    D  3      6%   dd
It would not, as AES-NI (AFAIK) does need very little CPU
assistance. AES-NI may be the problem though. Can you try with
the normal AES module? I think unloading the AES-NI module
may be enough for that, but I am not sure.
Maybe AES-NI needs very long for something it needs to do each
sector. Google("aes-ni slow") found at least some indications that
aes-ni may still have problems.
[...]
> It shouldn't be a hardware problem since I do see 400MB/s before encryption.
>
> I'm a bit lost here. I'll try other kernels just in case, but that shouldn't
> make a difference.
It could. I remember that some time ago, quite a few people had
issues with slow crypto due to some problems in the device-mapper
layers. You might have hit that.
Here is one benchmark from me, 1GB luks-file via /dev/loop,
residing on a 3-way RAID1 with 2 HDDs as write-mostly and
one older SSD. Kernel is 3.3.8, self-compiled on top of Debian
squeeze, pure software AES on AMD quad-core. (Yes, I know this
is complicated, but apparently works well, see below ;-)
/dev/mapper/x1:
Timing buffered disk reads: 612 MB in 3.00 seconds = 203.80 MB/sec
And the raw SSD:
/dev/sdd:
Timing buffered disk reads: 618 MB in 3.01 seconds = 205.56 MB/sec
Arno
--
Arno Wagner, Dr. sc. techn., Dipl. Inform., Email: arno@wagner.name
GnuPG: ID: 1E25338F FP: 0C30 5782 9D93 F785 E79C 0296 797F 6B50 1E25 338F
----
One of the painful things about our time is that those who feel certainty
are stupid, and those with any imagination and understanding are filled
with doubt and indecision. -- Bertrand Russell
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD
From: Yves-Alexis Perez @ 2012-07-23 6:07 UTC (permalink / raw)
To: Arno Wagner; +Cc: dm-crypt
On Sun, 2012-07-22 at 23:47 +0200, Arno Wagner wrote:
> On Sun, Jul 22, 2012 at 01:39:29PM -0700, Marc MERLIN wrote:
> > On Sun, Jul 22, 2012 at 09:47:32PM +0200, Yves-Alexis Perez wrote:
> > > > Any suggestions would be appreciated.
> > >
> > > I'm using Debian sid (so still at 3.2 kernel), currently using a 256G
> > > Samsung SSD. What I get is:
>
> SID? That would be "unstable", with possible assorted problems.
*I* am running SID, not the original reporter. And I have pretty decent
speed, thank you :)
> [...]
> > gandalfthegreat:~# dd if=/dev/mapper/ssdcrypt of=/dev/null bs=1M count=1024
> > 1073741824 bytes (1.1 GB) copied, 44.3302 s, 24.2 MB/s
> >
> > atop shows dd isn't really pegging a single core:
> > THR  SYSCPU  USRCPU  RDDSK   WRDSK  ST  EXC  S  CPUNR  CPU  CMD
> > 1    0.60s   0.01s   226.2M  0K     --  -    D  3      6%   dd
>
> It would not, as AES-NI (AFAIK) does need very little CPU
> assistance. AES-NI may be the problem though. Can you try with
> the normal AES module? I think unloading the AES-NI module
> may be enough for that, but I am not sure.
>
> Maybe AES-NI needs very long for something it needs to do each
> sector. Google("aes-ni slow") found at least some indications that
> aes-ni may still have problems.
And I do use aes-ni too.
--
Yves-Alexis
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD
From: Marc MERLIN @ 2012-07-23 6:28 UTC (permalink / raw)
To: dm-crypt, Arno Wagner
On Sun, Jul 22, 2012 at 11:47:57PM +0200, Arno Wagner wrote:
> On Sun, Jul 22, 2012 at 01:39:29PM -0700, Marc MERLIN wrote:
> > On Sun, Jul 22, 2012 at 09:47:32PM +0200, Yves-Alexis Perez wrote:
> > > > Any suggestions would be appreciated.
> > >
> > > I'm using Debian sid (so still at 3.2 kernel), currently using a 256G
> > > Samsung SSD. What I get is:
>
> SID? That would be "unstable", with possible assorted problems.
>
> [...]
> > gandalfthegreat:~# dd if=/dev/mapper/ssdcrypt of=/dev/null bs=1M count=1024
> > 1073741824 bytes (1.1 GB) copied, 44.3302 s, 24.2 MB/s
> >
> > atop shows dd isn't really pegging a single core:
> > THR  SYSCPU  USRCPU  RDDSK   WRDSK  ST  EXC  S  CPUNR  CPU  CMD
> > 1    0.60s   0.01s   226.2M  0K     --  -    D  3      6%   dd
>
> It would not, as AES-NI (AFAIK) does need very little CPU
> assistance. AES-NI may be the problem though. Can you try with
> the normal AES module? I think unloading the AES-NI module
> may be enough for that, but I am not sure.
>
> Maybe AES-NI needs very long for something it needs to do each
> sector. Google("aes-ni slow") found at least some indications that
> aes-ni may still have problems.
It was worth a shot, thanks for the suggestion.
gandalfthegreat:~# lsmod | grep aes
aes_x86_64 16796 34
gandalfthegreat:~# hdparm -t -T /dev/mapper/cryptroot
/dev/mapper/cryptroot:
Timing cached reads: 15802 MB in 2.00 seconds = 7909.98 MB/sec
Timing buffered disk reads: 68 MB in 3.04 seconds = 22.39 MB/sec
gandalfthegreat:~#
Unfortunately, speed is exactly the same.
with aes_x86_64:
gandalfthegreat:/var/local/VirtualBox VMs/w2k_virtual# dd if=test of=/dev/null
4192640+0 records in
4192640+0 records out
2146631680 bytes (2.1 GB) copied, 12.4848 s, 172 MB/s
I then rebooted with aesni_intel and got this:
gandalfthegreat:/var/local/VirtualBox VMs/w2k_virtual# dd if=test of=/dev/null
4192640+0 records in
4192640+0 records out
2146631680 bytes (2.1 GB) copied, 8.42506 s, 255 MB/s
So those speeds are expected/believable, but I still don't know why
hdparm is so slow on /dev/mapper/cryptroot.
Mmmh....
Marc
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD
From: Arno Wagner @ 2012-07-23 8:14 UTC (permalink / raw)
To: dm-crypt
On Sun, Jul 22, 2012 at 11:28:51PM -0700, Marc MERLIN wrote:
> On Sun, Jul 22, 2012 at 11:47:57PM +0200, Arno Wagner wrote:
> > On Sun, Jul 22, 2012 at 01:39:29PM -0700, Marc MERLIN wrote:
> > > On Sun, Jul 22, 2012 at 09:47:32PM +0200, Yves-Alexis Perez wrote:
> > > > > Any suggestions would be appreciated.
> > > >
> > > > I'm using Debian sid (so still at 3.2 kernel), currently using a 256G
> > > > Samsung SSD. What I get is:
> >
> > SID? That would be "unstable", with possible assorted problems.
Ok, sorry for my confusion, what kernel/distro are you running?
[...]
> > Maybe AES-NI needs very long for something it needs to do each
> > sector. Google("aes-ni slow") found at least some indications that
> > aes-ni may still have problems.
>
> It was worth a shot, thanks for the suggestion.
>
> gandalfthegreat:~# lsmod | grep aes
> aes_x86_64 16796 34
> gandalfthegreat:~# hdparm -t -T /dev/mapper/cryptroot
>
> /dev/mapper/cryptroot:
> Timing cached reads: 15802 MB in 2.00 seconds = 7909.98 MB/sec
> Timing buffered disk reads: 68 MB in 3.04 seconds = 22.39 MB/sec
> gandalfthegreat:~#
>
> Unfortunately, speed is exactly the same.
>
> with aes_x86_64:
> gandalfthegreat:/var/local/VirtualBox VMs/w2k_virtual# dd if=test of=/dev/null
> 4192640+0 records in
> 4192640+0 records out
> 2146631680 bytes (2.1 GB) copied, 12.4848 s, 172 MB/s
>
> I then rebooted with aesni_intel and got this:
> gandalfthegreat:/var/local/VirtualBox VMs/w2k_virtual# dd if=test of=/dev/null
> 4192640+0 records in
> 4192640+0 records out
> 2146631680 bytes (2.1 GB) copied, 8.42506 s, 255 MB/s
That does not seem to be it.
> So those speeds are expected/believable, but I still don't know why
> hdparm is so slow on /dev/mapper/cryptroot.
>
> Mmmh....
Indeed.
Arno
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD
From: Milan Broz @ 2012-07-23 10:46 UTC (permalink / raw)
To: dm-crypt
>> So those speeds are expected/believable, but I still don't know why
>> hdparm is so slow on /dev/mapper/cryptroot.
First, hdparm basically runs 2M reads in direct-io mode for its test, and
all IOs are submitted from one process.
AES-NI helps a lot, and because it is prioritised in the crypto API,
you are usually using it without any additional config if the hw supports it.
(Also I see some patches which run XTS blocks in parallel on the crypto API list.)
But that's not the problem you are seeing now.
(I will explain it on recent kernel code; note that RHEL5/6 have older code
and dmcrypt behaves slightly differently when processing requests.)
Current in-kernel dmcrypt code tries to keep IO processing on the CPU which
submitted the request. It means that if one process generates all requests,
all requests are usually processed by the one core where this process is
scheduled, not using other CPU cores. (Saying usually because page cache can
submit requests on a different cpu.)
So no wonder that you get slow operation - in dd/hdparm usually only
one cpu is processing the requests. If the CPU is fast enough, no problem.
If not, you will see a slowdown. AES-NI will speed this up on that cpu core,
but will not help run requests in parallel on multi-core systems.
In a real fs use case, with more applications submitting io requests, you will
get much better throughput. Also if not using directio, page cache can help
and filesystems are usually more clever too (writes are usually better in
performance - try it).
I do not like this dmcrypt mode and we tried to fix it. There is a bunch of patches
from Mikulas Patocka which switch parallelization to use all available
cpus (if not limited by parameter).
In my tests it improved performance in some cases but not in all situations
(there were some slowdowns which scare me).
(You can see patches here http://people.redhat.com/mpatocka/patches/kernel/dm-crypt-paralelizace/)
Unfortunately discussion stopped and the device-mapper maintainer forgot about it.
Well, your mails apparently caused some heads-up here, so I'll try to return
to this. (Will post them to dm-devel directly this time.)
I think the fast SSD case is exactly the situation where it should help.
We will need some testers though :)
(I have currently no fast SSD available for testing.)
Milan
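The point above about a single submitter being served by a single core can be probed from user space by reading the same device from several workers at once and comparing the elapsed time with a one-worker run. The following is only a rough sketch under stated assumptions: the function names `read_range` and `parallel_read` are invented here, the threads do buffered (not direct) reads, and nothing in the thread guarantees that the kernel will spread the work the way this hopes.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor


def read_range(path, start, length, chunk=1 << 20):
    """Read `length` bytes starting at `start`; return the bytes actually read."""
    fd = os.open(path, os.O_RDONLY)
    try:
        done = 0
        while done < length:
            data = os.pread(fd, min(chunk, length - done), start + done)
            if not data:  # end of file or device reached early
                break
            done += len(data)
        return done
    finally:
        os.close(fd)


def parallel_read(path, total_bytes, workers):
    """Split [0, total_bytes) into contiguous ranges and read them concurrently.

    Returns (bytes_read, elapsed_seconds).
    """
    share = total_bytes // workers
    ranges = [
        (i * share, share if i < workers - 1 else total_bytes - i * share)
        for i in range(workers)
    ]
    began = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = list(pool.map(lambda r: read_range(path, *r), ranges))
    return sum(counts), time.perf_counter() - began
```

A hypothetical comparison would be `parallel_read("/dev/mapper/ssdcrypt", 1 << 30, 1)` against the same call with `workers=4`, run as root with the page cache dropped in between; if the two timings barely differ, the bottleneck is probably elsewhere.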
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD
From: Yves-Alexis Perez @ 2012-07-23 11:09 UTC (permalink / raw)
To: Milan Broz; +Cc: dm-crypt, Marc MERLIN
On Mon, 2012-07-23 at 12:46 +0200, Milan Broz wrote:
> So no wonder that you get slow operation - in dd/hdparm usually only
> one cpu is processing the request. If CPU is fast enough, no problem.
> If not you will see slowdown. AES-NI will speed up this on that cpu core,
> but will not help run request in parallel on multi-core systems.
Then I should see slow operations too, since I'm doing exactly the same
test. My guess is that it is a kernel issue somewhere, but maybe we should
try a common ground (say, a grml live or some fedora live) and report
back?
Regards,
--
Yves-Alexis
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD
From: Milan Broz @ 2012-07-23 11:37 UTC (permalink / raw)
To: Yves-Alexis Perez; +Cc: dm-crypt, Marc MERLIN
On 07/23/2012 01:09 PM, Yves-Alexis Perez wrote:
> On Mon, 2012-07-23 at 12:46 +0200, Milan Broz wrote:
>> So no wonder that you get slow operation - in dd/hdparm usually only
>> one cpu is processing the request. If CPU is fast enough, no problem.
>> If not you will see slowdown. AES-NI will speed up this on that cpu core,
>> but will not help run request in parallel on multi-core systems.
>
> Then I should see slow operations too, since I'm doing exactly the same
> test. My guess is that is a kernel issue somewhere, but maybe we should
> try a common ground (say, a grml live or some fedora live) and report
> back?
Well, yes. I explained the dmcrypt part, there can be another problem.
E.g. alignment (but it _seems_ correct from the mails).
Reading again, 23MB/s is really slow, so yes there must be something else.
A common distro env is nice, but be sure you have the proper crypto modules available.
Also do not use Fedora rawhide, it has a kernel compiled with debug tools
not usable for testing.
You can try to start with this:
1) (this should not be a problem but better check it)
Check alignment of partitions. Is it aligned to SSD page size?
(Aligning to 1MiB is always correct ;-)
Paste fdisk -l -u /dev/sda
2) try to switch the io scheduler to "noop" or "deadline"
(paste lsblk -t)
You can do it online for sda (again, check with lsblk -t):
echo "noop">/sys/block/sda/queue/scheduler
Also you can try to increase the queue size.
(Hard core version is to run blktrace and check if requests are not split
unnecessarily :)
3) Let's test cipher_null (no encryption just fake-copy)
(you need cryptsetup 1.4.3 for this test).
create a test LUKS device with the null cipher: cryptsetup luksFormat -c null
Repeat the tests now - is the problem the same? (please send output).
(For the dmcrypt device, speed should be only slightly slower.)
please note: cipher null means no encryption, just use of the dmcrypt layer,
so do not use it for valid data :-)
4) which aes module are you using? check lsmod, check /proc/crypto
you should use either aes-ni or optimized modules (x86_64 etc)
Milan
p.s.
sorry for removing cc on first reply, my mistake.
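For step 4 in the checklist above, the text of /proc/crypto can be inspected programmatically to see which registered driver has the highest priority for a given algorithm name. This is a hedged sketch, not something from the thread: it assumes the usual layout of /proc/crypto (blank-line-separated entries of `key : value` lines that include `name`, `driver` and `priority`), and the helper names `parse_proc_crypto` and `preferred_driver` are invented for illustration.

```python
def parse_proc_crypto(text):
    """Parse /proc/crypto-style text into a list of dicts, one per entry.

    Entries are assumed to be separated by blank lines, with one
    'key : value' pair per line.
    """
    entries = []
    for block in text.strip().split("\n\n"):
        entry = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                entry[key.strip()] = value.strip()
        if entry:
            entries.append(entry)
    return entries


def preferred_driver(entries, name):
    """Return the driver with the highest priority registered under `name`.

    Returns None when no entry with that name carries a priority.
    """
    matches = [e for e in entries if e.get("name") == name and "priority" in e]
    if not matches:
        return None
    return max(matches, key=lambda e: int(e["priority"]))["driver"]
```

On a live system one would pass in the contents of /proc/crypto and ask for `"aes"`; note that composite names such as `xts(aes)` are listed as separate entries, so checking only the plain cipher name may not tell the whole story.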
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD
From: André Gall @ 2012-07-23 15:08 UTC (permalink / raw)
To: dm-crypt
I remember a test which I conducted after I received my Intel SSD 510
120 GB about 1 year ago. I don't remember all the details, but the
essence was that write speed decreased dramatically if the write process
wasn't aligned.
I did:
dd if=/dev/zero of=/dev/sda bs=4k
I got around 400 Mb/sec, which was expected.
But when I did:
dd if=/dev/zero of=/dev/sda bs=5k
I got only around 40 Mb/sec
When you write in 5k blocks, the SSD has to do a lot of overhead because
of the Read-Modify-Write operations.
I guess this might also be the case if you write in 4k blocks, but don't
align them to the SSD's "native" blocks.
So I would check for any misalignment.
Regards,
André
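The read-modify-write argument above can be illustrated with a little arithmetic: count how many device pages each write covers only partially. This is a simplified sketch that assumes a 4096-byte page and ignores any merging done by the page cache or the drive; `partial_pages` is a name made up for this example.

```python
def partial_pages(offset, length, page=4096):
    """Count pages that the write [offset, offset + length) covers only partly."""
    end = offset + length
    first = offset // page
    last = (end - 1) // page
    count = 0
    for p in range(first, last + 1):
        page_start = p * page
        page_end = page_start + page
        # A page is partially written if the write starts after the page
        # begins or stops before the page ends.
        if offset > page_start or end < page_end:
            count += 1
    return count


def total_partial(block_size, writes, page=4096):
    """Sum partial-page counts over `writes` back-to-back writes of `block_size`."""
    return sum(partial_pages(i * block_size, block_size, page)
               for i in range(writes))


print("bs=4k:", total_partial(4096, 4), "partial pages in 4 writes")
print("bs=5k:", total_partial(5120, 4), "partial pages in 4 writes")
```

With 4 KiB blocks starting at offset 0 every write covers whole pages, whereas every 5 KiB write leaves at least one page partially covered, which is the situation in which a device may have to read, merge and rewrite.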
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD
From: Heinz Diehl @ 2012-07-24 14:06 UTC (permalink / raw)
To: Milan Broz; +Cc: dm-crypt
On 23.07.2012, Milan Broz wrote:
> I do not like this dmcrypt mode and we tried to fix it. There is a bunch of patches
> from Mikulas Patocka which switch parallelization to use all available
> cpus (if not limited by parameter).
> In my tests it improved performance in some cases but not in all situations
> (there were some slowdowns which scare me).
> (You can see patches here http://people.redhat.com/mpatocka/patches/kernel/dm-crypt-paralelizace/)
This is definitely the way to go, in the age of
multicore-systems. Some of the patches are already included in
linux-3.5, and the rest needs to be rebased on top of it. I'd like to
try the whole series on a quadcore testing machine, but I'm not
familiar with the code in most of the patches, and only one single
wrong merge could lead to wrong conclusions.
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD
From: Milan Broz @ 2012-07-24 14:16 UTC (permalink / raw)
To: dm-crypt
On 07/24/2012 04:06 PM, Heinz Diehl wrote:
> On 23.07.2012, Milan Broz wrote:
>
>> I do not like this dmcrypt mode and we tried to fix it. There is a bunch of patches
>> from Mikulas Patocka which switch parallelization to use all available
>> cpus (if not limited by parameter).
>> In my tests it improved performance in some cases but not in all situations
>> (there were some slowdowns which scare me).
>> (You can see patches here http://people.redhat.com/mpatocka/patches/kernel/dm-crypt-paralelizace/)
>
> This is definitely the way to go, in the age of
> multicore-systems. Some of the patches are already included in
> linux-3.5, and the rest needs to be rebased on top of it. I'd like to
> try the whole series on a quadcore testing machine, but I'm not
> familiar with the code in most of the patches, and only one single
> wrong merge could lead to wrong conclusions.
You can use my working repo, it can be applied cleanly on top of 3.5
http://mbroz.fedorapeople.org/dm-crypt/parallel/
But I ran just a very simple test, there can be mistakes.
(There are still a lot of questions and maybe some ideas will not
be implemented this way. Anyway, if you can prove it helps in your
config, let me know.)
Milan
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD
From: Marc MERLIN @ 2012-07-23 16:12 UTC (permalink / raw)
To: dm-crypt, Milan Broz; +Cc: Yves-Alexis Perez
On Mon, Jul 23, 2012 at 10:14:07AM +0200, Arno Wagner wrote:
> > > SID? That would be "unstable", with possible assorted problems.
>
> Ok, sorry for my confusion, what kernel/distro are you running?
Debian testing with pieces of unstable :)
That gives me cryptsetup 1.4.3
(but debian unstable is often not more unstable than your released fedora
core or ubuntu in my experience)
Hi Milan,
Thanks for the answer and all your questions.
On Mon, Jul 23, 2012 at 12:46:38PM +0200, Milan Broz wrote:
> AES-NI helps a lot and because it is prioritised in crypto api,
> you usually using it without any additional config if hw supports it.
> (Also I see some patches which run XTS blocks in parallel on crypto api list.)
Yes, the modules all worked perfectly and the correct one was prioritized.
> So no wonder that you get slow operation - in dd/hdparm usually only
> one cpu is processing the request. If CPU is fast enough, no problem.
> If not you will see slowdown. AES-NI will speed up this on that cpu core,
> but will not help run request in parallel on multi-core systems.
Obviously, now that I've already verified that dd is slow while filesystem
operations are almost 10x faster, you're obviously onto something here.
But, I'm confused, why does atop show that dd is only using 6% CPU?
Oooh, it's not that my CPU is fully used, it's just that my CPU is able to
decrypt as quickly as the data is coming in for a 100MB/s hard drive, but
not a 500MB/s SSD and however scheduling works if the data is coming in
faster than the CPU can decode those pages.
(edited: actually no, using 'null' encryption still gives 25MB/s).
> I do not like this dmcrypt mode and we tried to fix it. There is a bunch of patches
> from Mikulas Patocka which switch parallelization to use all available
> cpus (if not limited by parameter).
> In my tests it improved performance in some cases but not in all situations
> (there were some slowdowns which scare me).
> (You can see patches here http://people.redhat.com/mpatocka/patches/kernel/dm-crypt-paralelizace/)
>
> Unfortunately discussion stopped and device-mapper maintainer forgot about it.
>
> Well, your mails apparently caused some heads-up here, so I'll try to return
> to this. (Will post them to dm-devel directly this time.)
I appreciate your answer and your looking into this.
Since I run recent self compiled kernel.org kernels, I can test patches as
long as they're reasonably certain not to turn my data into garbage :)
(I have backups, but it just took me too long to rebuild my laptop after my
last SSD crash).
On Mon, Jul 23, 2012 at 01:37:26PM +0200, Milan Broz wrote:
> Common distro env is nice but be sure you have proper crypto modules available.
> Also do not use Fedora rawhide, it has kernel compiled with debug tools
> not usable for testing.
Mmmh, I have one possible thing. I have a preempt kernel. Could that be it?
http://marc.merlins.org/tmp/config-3.4.4-amd64-preempt.txt
> You can try start with this:
>
> 1) (this should be not problem but better check it)
> Check alignment of partitions. Is it aligned to SSD page size?
> (Aligning to 1MiB is always correct ;-)
> Paste fdisk -l -u /dev/sda
Disk /dev/sda: 512.1 GB, 512110190592 bytes
255 heads, 63 sectors/track, 62260 cylinders, total 1000215216 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x09aaf50a
   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048      502047      250000   83  Linux
/dev/sda2          502048    52930847    26214400   83  Linux
/dev/sda3        52930848    73902367    10485760   82  Linux swap / Solaris
/dev/sda4        73902368  1000215215   463156424   83  Linux
I also used:
cryptsetup luksFormat --align-payload=8192
> 2) try to switch io scheduler to "noop" or "deadline"
> (paste lsblk -t)
I tried both noop and deadline (never used cfq) and it didn't help.
gandalfthegreat:~# lsblk -t
NAME                 ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED    RQ-SIZE
sda                          0    512      0     512     512    0 deadline     128
├─sda1                       0    512      0     512     512    0 deadline     128
├─sda2                       0    512      0     512     512    0 deadline     128
├─sda3                       0    512      0     512     512    0 deadline     128
└─sda4                       0    512      0     512     512    0 deadline     128
  └─cryptroot (dm-0)         0    512      0     512     512    0              128
But just to make sure, I tried cfq, noop, and deadline and it didn't make a
difference.
> Also you can try to increase queue size.
Not sure which one it is:
gandalfthegreat:/sys/block/sda/queue# grep . *
add_random:1
discard_granularity:512
discard_max_bytes:2147450880
discard_zeroes_data:0
hw_sector_size:512
iostats:1
logical_block_size:512
max_hw_sectors_kb:32767
max_integrity_segments:0
max_sectors_kb:512
max_segments:168
max_segment_size:65536
minimum_io_size:512
nomerges:0
nr_requests:128
optimal_io_size:0
physical_block_size:512
read_ahead_kb:128
rotational:0
rq_affinity:1
scheduler:[noop] deadline cfq
> (Hard core version is to run blktrace and check if request are not split
> unnecessarily :)
I'm not too sure how to read the output, but there it is:
http://marc.merlins.org/tmp/blktrace.txt
Generated with:
gandalfthegreat:~# reset_cache ; dd if=/dev/mapper/cryptroot of=/dev/null bs=1M count=10; killall blktrace
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.514707 s, 20.4 MB/s
> 3) Let's test cipher_null (no encryption just fake-copy)
> (you need cryptsetup 1.4.3 for this test).
>
> create test LUKS device with null cipher: cryptsetup luksFormat -c null
gandalfthegreat:~# cryptsetup luksFormat --align-payload=8192 -c null /dev/sda2
Are you sure? (Type uppercase yes): YES
Enter LUKS passphrase:
Verify passphrase:
gandalfthegreat:~# cryptsetup luksOpen /dev/sda2 test
Enter passphrase for /dev/sda2:
gandalfthegreat:~# hdparm -t -T /dev/mapper/test
/dev/mapper/test:
Timing cached reads: 12932 MB in 2.00 seconds = 6471.89 MB/sec
Timing buffered disk reads: 76 MB in 3.04 seconds = 25.01 MB/sec
gandalfthegreat:~#
So it's not related to the kind of encryption.
> please note: cipher null means no encryption, just use dmcrypt layer,
> so do not use for valid data :-)
Yes, I figured :)
> 4) which aes module are you using? check lsmod, check /proc/crypto
>
> you should use either aes-ni or optimized modules (x86_64 etc)
Yep, I tried using both (aes-ni is default) and result was the same.
I'll try rebuilding a non preempt kernel just in case.
Marc
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD 2012-07-23 16:12 ` Marc MERLIN @ 2012-07-23 16:19 ` Yves-Alexis Perez 2012-07-23 17:54 ` Marc MERLIN 2012-07-23 17:15 ` Milan Broz 1 sibling, 1 reply; 37+ messages in thread From: Yves-Alexis Perez @ 2012-07-23 16:19 UTC (permalink / raw) To: Marc MERLIN; +Cc: dm-crypt, Milan Broz [-- Attachment #1: Type: text/plain, Size: 685 bytes --] On lun., 2012-07-23 at 09:12 -0700, Marc MERLIN wrote: > On Mon, Jul 23, 2012 at 10:14:07AM +0200, Arno Wagner wrote: > > > > SID? That would be "unstable", whit possible assorted problems. > > > > Ok, sorry for my confusion, what kernel/distro are you running? > > Debian testing with pieces of unstable :) > That gives me cryptsetup 1.4.3 > (but debian unstable is often not more unstable than your released fedora > core or ubuntu in my experience) > […] > I'll try rebuilding a non preempt kernel just in case. Maybe try with Debian kernel first? It's just an apt-get install away and will give you a setup identical as mine. Regards, -- Yves-Alexis [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD 2012-07-23 16:19 ` Yves-Alexis Perez @ 2012-07-23 17:54 ` Marc MERLIN 2012-07-23 19:26 ` Yves-Alexis Perez 0 siblings, 1 reply; 37+ messages in thread From: Marc MERLIN @ 2012-07-23 17:54 UTC (permalink / raw) To: Yves-Alexis Perez, André Gall; +Cc: dm-crypt On Mon, Jul 23, 2012 at 06:19:10PM +0200, Yves-Alexis Perez wrote: > > I'll try rebuilding a non preempt kernel just in case. > > Maybe try with Debian kernel first? It's just an apt-get install away > and will give you a setup identical as mine. I have a custom initramfs, it was easier to compile a new kernel than to try and get the debian one to boot my system :) But for reference, what kernel package are you using? On Mon, Jul 23, 2012 at 07:27:47PM +0200, André Gall wrote: > I remember a test which I conducted after I received my Intel SDD 510 > 120 GB about 1 year ago. I don't remember all the details, but the > essence was that write speed decreased dramatically if the write process > wasn't aligned. > I did: > dd if=/dev/zero of=/dev/sda bs=4k > I got around 400 Mb/sec, which was expected. But when I did: > > dd if=/dev/zero of=/dev/sda bs=5k > I got only around 40 Mb/sec Yes, write alignment and write amplification are definitely issues on SSDs, but note how I was only doing read tests :) See http://www.void.gr/kargig/blog/2012/01/11/linux-ssd-partition-alignment-tips/ Cheers, Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD 2012-07-23 17:54 ` Marc MERLIN @ 2012-07-23 19:26 ` Yves-Alexis Perez 0 siblings, 0 replies; 37+ messages in thread From: Yves-Alexis Perez @ 2012-07-23 19:26 UTC (permalink / raw) To: Marc MERLIN; +Cc: dm-crypt, André Gall [-- Attachment #1: Type: text/plain, Size: 737 bytes --] On lun., 2012-07-23 at 10:54 -0700, Marc MERLIN wrote: > On Mon, Jul 23, 2012 at 06:19:10PM +0200, Yves-Alexis Perez wrote: > > > I'll try rebuilding a non preempt kernel just in case. > > > > Maybe try with Debian kernel first? It's just an apt-get install > away > > and will give you a setup identical as mine. > > I have a custom initramfs, it was easier to compile a new kernel than > to try > and get the debian one to boot my system :) > > But for reference, what kernel package are you using? Linux version 3.2.0-3-amd64 (Debian 3.2.21-3) (debian-kernel@lists.debian.org) (gcc version 4.6.3 (Debian 4.6.3-8) ) #1 SMP Thu Jun 28 09:07:26 UTC 2012 And I'll upgrade to 3.2.23-1 just know. -- Yves-Alexis [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD 2012-07-23 16:12 ` Marc MERLIN 2012-07-23 16:19 ` Yves-Alexis Perez @ 2012-07-23 17:15 ` Milan Broz 2012-07-23 17:51 ` Marc MERLIN 1 sibling, 1 reply; 37+ messages in thread From: Milan Broz @ 2012-07-23 17:15 UTC (permalink / raw) To: Marc MERLIN; +Cc: dm-crypt, Yves-Alexis Perez On 07/23/2012 06:12 PM, Marc MERLIN wrote: > On Mon, Jul 23, 2012 at 10:14:07AM +0200, Arno Wagner wrote: > (editted: actually no, using 'null' encryptino still gives 25MB/s). Ok, the we can forget about aes-ni etc, it is problem somewhere else. > Mmmh, I have one possible thing. I have a preempt kernel. Could that be it? > http://marc.merlins.org/tmp/config-3.4.4-amd64-preempt.txt Can you send me your kernel .config then? Preempt should not be problem. Which kernel version? which architecture? Any additional patches over mainline code? >> Paste fdisk -l -u /dev/sda > > Disk /dev/sda: 512.1 GB, 512110190592 bytes > 255 heads, 63 sectors/track, 62260 cylinders, total 1000215216 sectors > Units = sectors of 1 * 512 = 512 bytes > Sector size (logical/physical): 512 bytes / 512 bytes > I/O size (minimum/optimal): 512 bytes / 512 bytes > Disk identifier: 0x09aaf50a > > Device Boot Start End Blocks Id System > /dev/sda1 * 2048 502047 250000 83 Linux > /dev/sda2 502048 52930847 26214400 83 Linux > /dev/sda3 52930848 73902367 10485760 82 Linux swap / Solaris > /dev/sda4 73902368 1000215215 463156424 83 Linux Hm. sda1 is apparently aligned. But I think other partitions are not aligned properly. No idea which block size/page size your SSD internaly use, but to be safe, let's assume ideal is 256KiB (= 512 * 512 byte sectors). 73902368 seems not to be multiple of 512... Well, your sda2 configuration is created with +256MB, which is SI unit, so not multiple of 1024 but 1000! 
It should be created with +256M, so you end with:

Device Boot Start End Blocks Id System
/dev/sda1 2048 526335 262144 83 Linux
/dev/sda2 526336 ...

But I think this is not the core problem, though you should try to fix it.
(I hope I did not mix up numbers above :)

... Thinking about it, we can test it without partition change (for reads).

This test is perhaps useless but maybe someone can find it useful later.

Let's map properly aligned device to sda4 (read only). Result will be just
garbage, but it is just for speed testing.

For mapping above, properly aligned (1MiB alignment = 2048 sectors)
should start on sector 73904128. This is +1760 sector shift for sda4.

Map 1GiB device there:
# dmsetup create test_plain -r --table "0 2097152 linear /dev/sda4 1760"

Map null cryptsetup over it
# echo "password" | cryptsetup create -c null test_crypt /dev/mapper/test_plain

Now repeat read dd test for /dev/mapper/test_plain and /dev/mapper/test_crypt

Still the same slow down?

Remove devices
# dmsetup remove test_crypt
# dmsetup remove test_plain

(Btw if you use real cipher and zero device instead of underlying disk,
you can measure encryption throughput:
# dmsetup create test_plain --table "0 2097152 zero"
# echo "password" | cryptsetup create -c aes-xts-plain64 -s 256 test_crypt /dev/mapper/test_plain
... very useful to prove that your cpu encryption is slow... but we proved
that it is not the case here, so just for the archive :-)

> I also used:
> cryptsetup luksFormat --align-payload=8192

It will not help if underlying partition is misaligned.

>> 2) try to switch io scheduler to "noop" or "deadline"

> But just to make sure, I tried cfg, noop, and deadline and it didn't make a
> difference.

ok

>> Also you can try to increase queue size.
>
> Not sure which one it is:

I think this one, try to increase it to 8192 or so...

> nr_requests:128

Milan

^ permalink raw reply [flat|nested] 37+ messages in thread
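[Editor's sketch, not part of the original message.] The alignment arithmetic in the message above can be checked with plain shell arithmetic. The helper name align_shift is hypothetical; it only computes how many 512-byte sectors a start sector must move forward to reach the next multiple of a given alignment. For sda4 (start sector 73902368) and 1 MiB alignment (2048 sectors) the shift is 1760, which puts the aligned start at sector 73904128.

```shell
#!/bin/sh
# Sketch: distance from a partition start to the next alignment boundary.
# Both arguments are counted in 512-byte sectors.
# align_shift is a hypothetical helper name, not part of fdisk or dmsetup.
align_shift() {
    # $1 = partition start sector, $2 = alignment in sectors
    echo $(( ($2 - $1 % $2) % $2 ))
}

# sda4 from the fdisk listing starts at sector 73902368.
align_shift 73902368 2048   # 1 MiB alignment (2048 sectors)
align_shift 73902368 512    # 256 KiB alignment (512 sectors)
align_shift 2048 2048       # sda1, already on a 1 MiB boundary
```

A result of 0 means the start sector is already aligned; any other value is the offset that would go in the last field of a "linear" dmsetup table like the one used in the test above.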
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD 2012-07-23 17:15 ` Milan Broz @ 2012-07-23 17:51 ` Marc MERLIN 2012-07-23 21:31 ` Milan Broz 0 siblings, 1 reply; 37+ messages in thread From: Marc MERLIN @ 2012-07-23 17:51 UTC (permalink / raw) To: Milan Broz, Yves-Alexis Perez; +Cc: dm-crypt On Mon, Jul 23, 2012 at 07:15:24PM +0200, Milan Broz wrote: > > Mmmh, I have one possible thing. I have a preempt kernel. Could that be it? > > http://marc.merlins.org/tmp/config-3.4.4-amd64-preempt.txt > > Can you send me your kernel .config then? Preempt should not be problem. > Which kernel version? which architecture? Any additional patches over > mainline code? I just sent you my config, it was in the URL above :) No patches, kernel 3.4.4 from kernel.org, see above. I tried without preempt and with volprempt and duplicated the same slow speeds. > > Disk /dev/sda: 512.1 GB, 512110190592 bytes > > 255 heads, 63 sectors/track, 62260 cylinders, total 1000215216 sectors > > Units = sectors of 1 * 512 = 512 bytes > > Sector size (logical/physical): 512 bytes / 512 bytes > > I/O size (minimum/optimal): 512 bytes / 512 bytes > > Disk identifier: 0x09aaf50a > > > > Device Boot Start End Blocks Id System > > /dev/sda1 * 2048 502047 250000 83 Linux > > /dev/sda2 502048 52930847 26214400 83 Linux > > /dev/sda3 52930848 73902367 10485760 82 Linux swap / Solaris > > /dev/sda4 73902368 1000215215 463156424 83 Linux > > Hm. sda1 is apparently aligned. But I think > other partitions are not aligned properly. Really? I used fdisk -H32 -S32 /dev/sda as recomended on http://www.void.gr/kargig/blog/2012/01/11/linux-ssd-partition-alignment-tips/ > No idea which block size/page size your SSD internaly use, but to be safe, > let's assume ideal is 256KiB (= 512 * 512 byte sectors). > 73902368 seems not to be multiple of 512... > > Well, your sda2 configuration is created with +256MB, which is SI unit, > so not multiple of 1024 but 1000! 
Oh my, I can't believe they changed that (been using linux since 1993 back then +1MB in fdisk was a real MB). > It should be created with +256M, so you end with: > Device Boot Start End Blocks Id System > /dev/sda1 2048 526335 262144 83 Linux > /dev/sda2 526336 ... > > But I think this is not the core problem but you should try it to fix. > (I hope I did not mixed up numbers above :) It would be a problem for erase blocks if it is wrong, but not for reading, especially when I can do filesystem reads at 260MB/s on the filesystem created on the encrypted device. But I'm confused, I thought fdisk -H32 -S32 would protect me from mis-alignment and fdisk/util-linux 2.20.1 was smart enough to use proper boundaries on its own on SSDs anyway. > ... Thinking about it, we can test it without partition change (for reads). > > This test is perhaps useless but maybe someone can find it useful later. > > Let's map properly aligned device to sda4 (read only). Result will be just > garbage, but it is just for speed testing. > > For mapping above, properly aligned (1MiB alignment = 2048 sectors) > should start on sector 7390412. This is +1760 sector shift for sda4. > > Map 1GiB device there: > # dmsetup create test_plain -r --table "0 2097152 linear /dev/sda4 1760" > > Map null cryptsetup over it > # echo "password" | cryptsetup create -c null test_crypt /dev/mapper/test_plain > > Now repeat read dd test for /dev/mapper/test_plain and /dev/mapper/test_crypt Same thing: gandalfthegreat:~# dmsetup create test_plain -r --table "0 2097152 linear /dev/sda4 1760" gandalfthegreat:~# echo "password" | cryptsetup create -c null test_crypt /dev/mapper/test_plain gandalfthegreat:~# dmsetup status test_crypt 0 2097152 crypt gandalfthegreat:~# cryptsetup status test_crypt /dev/mapper/test_crypt is active. 
type: PLAIN cipher: cipher_null-ecb keysize: 0 bits device: /dev/mapper/test_plain offset: 0 sectors size: 2097152 sectors mode: read/write gandalfthegreat:~# hdparm -t -T /dev/mapper/test_plain /dev/mapper/test_plain: Timing cached reads: 13978 MB in 2.00 seconds = 6997.30 MB/sec Timing buffered disk reads: 958 MB in 3.00 seconds = 319.00 MB/sec gandalfthegreat:~# hdparm -t -T /dev/mapper/test_crypt /dev/mapper/test_crypt: Timing cached reads: 15434 MB in 2.00 seconds = 7725.50 MB/sec Timing buffered disk reads: 76 MB in 3.03 seconds = 25.05 MB/sec gandalfthegreat:~# > > I also used: > > cryptsetup luksFormat --align-payload=8192 > > It will not help if underlying partition is misaligned. That's correct of course :) Now I'm going to have to do some more reading to see whether I'm really misaligned or not. I had done all the relevant reading on SSDs to make sure I did use proper alignment. I wonder why fdisk -H32 -S32 failed me. > I think this one, try to increase it to 8192 or so... > > > nr_requests:128 Didn't help. gandalfthegreat:/sys/block/sda/queue# echo 8192 > nr_requests gandalfthegreat:/sys/block/sda/queue# hdparm -t -T /dev/mapper/cryptroot /dev/mapper/cryptroot: Timing cached reads: 14970 MB in 2.00 seconds = 7492.34 MB/sec Timing buffered disk reads: 72 MB in 3.08 seconds = 23.35 MB/sec gandalfthegreat:/sys/block/sda/queue# Thanks for the other suggestions. Hopefully we'll nail this somehow :) Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD
2012-07-23 17:51 ` Marc MERLIN
@ 2012-07-23 21:31 ` Milan Broz
2012-07-24 5:57 ` Marc MERLIN
2012-07-24 6:11 ` Heinz Diehl
0 siblings, 2 replies; 37+ messages in thread
From: Milan Broz @ 2012-07-23 21:31 UTC (permalink / raw)
To: Marc MERLIN; +Cc: dm-crypt, Yves-Alexis Perez

On 07/23/2012 07:51 PM, Marc MERLIN wrote:
> On Mon, Jul 23, 2012 at 07:15:24PM +0200, Milan Broz wrote:
>>> Mmmh, I have one possible thing. I have a preempt kernel. Could that be it?
>>> http://marc.merlins.org/tmp/config-3.4.4-amd64-preempt.txt
>>
>> Can you send me your kernel .config then? Preempt should not be problem.
>> Which kernel version? which architecture? Any additional patches over
>> mainline code?
>
> I just sent you my config, it was in the URL above :)
> No patches, kernel 3.4.4 from kernel.org, see above.

Ehm... sorry, I completely missed that. Thanks.

> Really?
> I used fdisk -H32 -S32 /dev/sda as recomended on
> http://www.void.gr/kargig/blog/2012/01/11/linux-ssd-partition-alignment-tips/

Do not use -H32 -S32. It is crazy and obsolete way how to align it...
Someone is wrong in the internet seems http://xkcd.com/386/ ;-)

Disk driver should set topology parameters which fdisk uses. But for your
case all is set to 512 bytes...

Whatever, there was a bug in fdisk, fixed now thanks to your report :)
https://github.com/karelzak/util-linux/commit/c0175e6185ac81843cbad33cbea75abd033f0e66

> Thanks for the other suggestions. Hopefully we'll nail this somehow :)

Well, please try some default distro compiled kernel if you can reproduce it there.

Milan

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD 2012-07-23 21:31 ` Milan Broz @ 2012-07-24 5:57 ` Marc MERLIN 2012-07-24 6:25 ` Heinz Diehl 2012-07-24 13:54 ` Milan Broz 2012-07-24 6:11 ` Heinz Diehl 1 sibling, 2 replies; 37+ messages in thread From: Marc MERLIN @ 2012-07-24 5:57 UTC (permalink / raw) To: Milan Broz; +Cc: dm-crypt, Yves-Alexis Perez On Mon, Jul 23, 2012 at 11:31:28PM +0200, Milan Broz wrote: > On 07/23/2012 07:51 PM, Marc MERLIN wrote: > > On Mon, Jul 23, 2012 at 07:15:24PM +0200, Milan Broz wrote: > >>> Mmmh, I have one possible thing. I have a preempt kernel. Could that be it? > >>> http://marc.merlins.org/tmp/config-3.4.4-amd64-preempt.txt > >> > >> Can you send me your kernel .config then? Preempt should not be problem. > >> Which kernel version? which architecture? Any additional patches over > >> mainline code? > > > > I just sent you my config, it was in the URL above :) > > No patches, kernel 3.4.4 from kernel.org, see above. > > Ehm... sorry, I completely missed that. Thanks. Mmmh, so I installed "standard" linux-image-3.2.0-3-amd64 from debian. And.... nothing changed :( /dev/mapper/cryptroot: Timing cached reads: 19642 MB in 2.00 seconds = 9833.88 MB/sec Timing buffered disk reads: 72 MB in 3.05 seconds = 23.59 MB/sec Did you find anything more useful here: http://marc.merlins.org/tmp/blktrace.txt Or can I take another blktrace that helps? > > Really? > > I used fdisk -H32 -S32 /dev/sda as recomended on > > http://www.void.gr/kargig/blog/2012/01/11/linux-ssd-partition-alignment-tips/ > > Do not use -H32 -S32. It is crazy and obsolete way how to align it... > Someone is wrong in the internet seems http://xkcd.com/386/ ;-) Yes, I know the cartoon :) Mmmh, so I'll have to reformat everything so that all my partition start numbers are multiple of 512. Maybe I can get parted to move at least partition #4 without losing all my data. I'll try that. 
However is using cryptsetup luksFormat --align-payload=8192 still the right thing for me? (with the understanding that alignment shouldn't really an issue for reads, but for writes) > Disk driver should set topology parameters which fdisk uses. But for your > case all is set to 512 bytes... > > Whatever, there was a bug in fdisk, fixed now thanks to your report :) > https://github.com/karelzak/util-linux/commit/c0175e6185ac81843cbad33cbea75abd033f0e66 Cool, thanks for that. Ok, so I repartitioned my first 3 partitions, which I could do without losing data: Disk /dev/sda: 512.1 GB, 512110190592 bytes 255 heads, 63 sectors/track, 62260 cylinders, total 1000215216 sectors Units = sectors of 1 * 512 = 512 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk identifier: 0x09aaf50a Device Boot Start End Blocks Id System /dev/sda1 * 2048 502271 250112 83 Linux /dev/sda2 502272 52930559 26214144 83 Linux /dev/sda3 52930560 73902079 10485760 82 Linux swap / Solaris /dev/sda4 73902368 1000215215 463156424 83 Linux OMG, that actually helped: gandalfthegreat:~# echo test | cryptsetup create -c null test_crypt /dev/sda2 gandalfthegreat:~# hdparm -t -T /dev/mapper/test_crypt /dev/mapper/test_crypt: Timing cached reads: 18186 MB in 2.00 seconds = 9103.83 MB/sec Timing buffered disk reads: 524 MB in 3.04 seconds = 172.63 MB/sec But with LUKS, it falls apart: gandalfthegreat:~# cryptsetup luksFormat --align-payload=8192 -c aes-xts-plain -s 256 /dev/sda2 (...) gandalfthegreat:~# cryptsetup luksOpen --allow-discards /dev/sda2 test (...) gandalfthegreat:~# hdparm -t -T /dev/mapper/test /dev/mapper/test: Timing cached reads: 17436 MB in 2.00 seconds = 8728.61 MB/sec Timing buffered disk reads: 74 MB in 3.03 seconds = 24.44 MB/sec Grumble. So 1) alignement has some effect and I'm not sure how to get luksFormat aligned right 2) Even in the better case above, 172.63 MB/sec is too slow. 
I was getting faster speeds by dding a file from potentially the encrypted filesystem. If blktrace doesn't show anything useful, is there anything else I can capture? Thanks, Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD
2012-07-24 5:57 ` Marc MERLIN
@ 2012-07-24 6:25 ` Heinz Diehl
2012-07-24 15:02 ` Marc MERLIN
2012-07-24 13:54 ` Milan Broz
1 sibling, 1 reply; 37+ messages in thread
From: Heinz Diehl @ 2012-07-24 6:25 UTC (permalink / raw)
To: dm-crypt

On 24.07.2012, Marc MERLIN wrote:

> Timing buffered disk reads: 72 MB in 3.05 seconds = 23.59 MB/sec

> Units = sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes

Please correct me if I should be wrong, but your drive should report
512/4096 here, so it lies about the real blocksize it uses (4k).

This raises the question if you have created your filesystem on top of
the encrypted partition with e.g. "-b 4096".

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD 2012-07-24 6:25 ` Heinz Diehl @ 2012-07-24 15:02 ` Marc MERLIN 2012-07-24 15:19 ` Milan Broz 0 siblings, 1 reply; 37+ messages in thread From: Marc MERLIN @ 2012-07-24 15:02 UTC (permalink / raw) To: dm-crypt, Milan Broz; +Cc: Yves-Alexis Perez On Tue, Jul 24, 2012 at 08:25:18AM +0200, Heinz Diehl wrote: > Please correct me if I should be wrong, but your drive should report > 512/4096 here, so it lies about the real blocksize it uses (4k). > This raises the question if you have created your filesystem on top of > the encrypted partition with e.g. "-b 4096". I'm using btrfs, which defaults to 4K blocks. Also, I was I seeing 270MB/s reading a big file with btrfs on top of cryptroot. On Tue, Jul 24, 2012 at 10:44:36AM +0200, Milan Broz wrote: > Seems I am running out of ideas :) > (I just read the mails again and I think I am missing something > obvious. Whatever, I will return to it later.) I wanted to command you for not giving up, you definitely went the extra mile :) > btw why is in blktrace process generating IO in log [bash] > and not [dd] or [hdparm] ? I noticed that. I used dd that time so that I could force an entire GB of data, and I was also surprised that it showed up in bash when indeed I did: gandalfthegreat:~# dd if=/dev/mapper/cryptroot of=/dev/null bs=1M count=1024 > Apparently you generated it with dd. (But maybe just old blktrace), > I have # blktrace --version > blktrace version 2.0.0 Same version here. On Tue, Jul 24, 2012 at 03:54:06PM +0200, Milan Broz wrote: > So. Can you please try to increase readahead (and also run it with direct-io)? > Just to check if it is the same problem or not... So, direct IO is faster but not as fast as it should, and .... readahead fixes it! 
gandalfthegreat:~# hdparm --direct -t /dev/mapper/cryptroot /dev/mapper/cryptroot: Timing O_DIRECT disk reads: 242 MB in 3.02 seconds = 80.26 MB/sec gandalfthegreat:~# dd if=/dev/mapper/cryptroot of=/dev/null bs=1M count=1024 iflag=direct 1073741824 bytes (1.1 GB) copied, 18.4517 s, 58.2 MB/s gandalfthegreat:~# blockdev --setra 8192 /dev/mapper/cryptroot gandalfthegreat:~# hdparm --direct -t /dev/mapper/cryptroot /dev/mapper/cryptroot: Timing O_DIRECT disk reads: 256 MB in 3.01 seconds = 85.10 MB/sec gandalfthegreat:~# dd if=/dev/mapper/cryptroot of=/dev/null bs=1M count=1024 iflag=direct 1073741824 bytes (1.1 GB) copied, 16.7627 s, 64.1 MB/s But non dirct IO is now fast: gandalfthegreat:~# hdparm -t /dev/mapper/cryptroot /dev/mapper/cryptroot: Timing buffered disk reads: 784 MB in 3.00 seconds = 261.08 MB/sec gandalfthegreat:~# dd if=/dev/mapper/cryptroot of=/dev/null bs=1M count=1024 1073741824 bytes (1.1 GB) copied, 3.82117 s, 281 MB/s A big thank you helping me track this down. Obviously for now I'll make sure I have this in my initscripts, and while your direct IO is much faster than mine (weird), it looks like you can reproduce this too in the non direct IO case at least. Hopefully that'll help fix the kernel. Feel free to ping me if there are kernel patches you'd like me to try before submission. Thanks again, Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ ^ permalink raw reply [flat|nested] 37+ messages in thread
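[Editor's sketch, not part of the original message.] The readahead values in this thread appear in two units: blockdev --setra and --getra count 512-byte sectors, while the read_ahead_kb file under /sys/block/<dev>/queue/ is in KiB. The helper name ra_sectors_to_kib below is hypothetical and only does that conversion; the blockdev command in the comment is the one from the message above, which needs root and a real device, so it is left commented out.

```shell
#!/bin/sh
# Sketch: convert a blockdev readahead value (512-byte sectors) to KiB.
# ra_sectors_to_kib is a hypothetical helper name, not part of util-linux.
ra_sectors_to_kib() {
    # $1 = readahead in 512-byte sectors
    echo $(( $1 * 512 / 1024 ))
}

ra_sectors_to_kib 256    # the default RA seen in this thread
ra_sectors_to_kib 8192   # the value that restored buffered read speed

# The workaround itself, as run in the thread (assumption: placed in an
# init script or rc.local so that it is reapplied on every boot):
# blockdev --setra 8192 /dev/mapper/cryptroot
```

The first call gives 128, matching the read_ahead_kb:128 value listed earlier for sda, and the second gives 4096 KiB, i.e. the "4M" readahead Milan mentions.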
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD
2012-07-24 15:02 ` Marc MERLIN
@ 2012-07-24 15:19 ` Milan Broz
2012-07-24 16:09 ` Marc MERLIN
0 siblings, 1 reply; 37+ messages in thread
From: Milan Broz @ 2012-07-24 15:19 UTC (permalink / raw)
To: Marc MERLIN; +Cc: dm-crypt, Yves-Alexis Perez

On 07/24/2012 05:02 PM, Marc MERLIN wrote:
> On Tue, Jul 24, 2012 at 08:25:18AM +0200, Heinz Diehl wrote:
>> Please correct me if I should be wrong, but your drive should report
>> 512/4096 here, so it lies about the real blocksize it uses (4k).
>> This raises the question if you have created your filesystem on top of
>> the encrypted partition with e.g. "-b 4096".
>
> I'm using btrfs, which defaults to 4K blocks. Also, I was I seeing 270MB/s
> reading a big file with btrfs on top of cryptroot.
>
> On Tue, Jul 24, 2012 at 10:44:36AM +0200, Milan Broz wrote:
>> Seems I am running out of ideas :)
>> (I just read the mails again and I think I am missing something
>> obvious. Whatever, I will return to it later.)
>
> I wanted to command you for not giving up, you definitely went the extra
> mile :)

:)

So seems elevator completely misbehaves for SSD in some situations.

I have no time to check it today but this must be fixed. Read-ahead
is just stupid workaround...

# echo "0">/sys/block/sdc/queue/rotational
# hdparm -t /dev/mapper/sdc_null_crypt
Timing buffered disk reads: 220 MB in 3.01 seconds = 73.07 MB/sec

# echo "1">/sys/block/sdc/queue/rotational
# hdparm -t /dev/mapper/sdc_null_crypt
Timing buffered disk reads: 652 MB in 3.01 seconds = 216.75 MB/sec

This SSD is quicker if set to rotational mode!
(So it merges requests in fact.)

Milan

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD
2012-07-24 15:19 ` Milan Broz
@ 2012-07-24 16:09 ` Marc MERLIN
0 siblings, 0 replies; 37+ messages in thread
From: Marc MERLIN @ 2012-07-24 16:09 UTC (permalink / raw)
To: Milan Broz; +Cc: dm-crypt, Yves-Alexis Perez

On Tue, Jul 24, 2012 at 05:19:26PM +0200, Milan Broz wrote:
> I have no time to check it today but this must be fixed. Read-ahead
> is just stupid workaround...
>
> # echo "0">/sys/block/sdc/queue/rotational
> # hdparm -t /dev/mapper/sdc_null_crypt
> Timing buffered disk reads: 220 MB in 3.01 seconds = 73.07 MB/sec
>
> # echo "1">/sys/block/sdc/queue/rotational
> # hdparm -t /dev/mapper/sdc_null_crypt
> Timing buffered disk reads: 652 MB in 3.01 seconds = 216.75 MB/sec
>
> This SSD is quicker if set to rotational mode!
> (So it merges requests in fact.)

That didn't help for me, strangely.

gandalfthegreat:~# blockdev --setra 256 /dev/mapper/cryptroot
gandalfthegreat:~# reset_cache
gandalfthegreat:~# dd if=/dev/mapper/cryptroot of=/dev/null bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 43.4968 s, 24.7 MB/s

gandalfthegreat:~# dmsetup ls | grep cryptroot
cryptroot (254:0)
gandalfthegreat:~# echo "1">/sys/block/dm-0/queue/rotational
gandalfthegreat:~# reset_cache
gandalfthegreat:~# dd if=/dev/mapper/cryptroot of=/dev/null bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 42.9933 s, 25.0 MB/s

Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/

^ permalink raw reply [flat|nested] 37+ messages in thread
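[Editor's sketch, not part of the original message.] The MB/s figures dd prints in the message above are bytes divided by seconds in SI megabytes (10^6 bytes), so the two runs can be recomputed and compared directly. The helper name throughput_mbs is hypothetical; it assumes an awk implementation is available and forces the C locale so the decimal point is a dot.

```shell
#!/bin/sh
# Sketch: recompute dd's reported throughput from bytes and elapsed seconds.
# throughput_mbs is a hypothetical helper name; MB here means 10^6 bytes,
# which is the unit dd uses in the output quoted in this thread.
throughput_mbs() {
    # $1 = bytes copied, $2 = elapsed time in seconds
    LC_ALL=C awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}

throughput_mbs 1073741824 43.4968   # readahead 256, rotational unchanged
throughput_mbs 1073741824 42.9933   # readahead 256, rotational set to 1 on dm-0
```

The two results, 24.7 and 25.0, match what dd printed: with readahead back at 256 sectors, toggling the rotational flag on dm-0 made essentially no difference on this machine.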
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD
2012-07-24 5:57 ` Marc MERLIN
2012-07-24 6:25 ` Heinz Diehl
@ 2012-07-24 13:54 ` Milan Broz
[not found] ` <500E9099.8050501@redhat.com>
2012-07-24 14:27 ` Heinz Diehl
1 sibling, 2 replies; 37+ messages in thread
From: Milan Broz @ 2012-07-24 13:54 UTC (permalink / raw)
To: Marc MERLIN; +Cc: dm-crypt, Yves-Alexis Perez

On 07/24/2012 07:57 AM, Marc MERLIN wrote:
> Mmmh, so I installed "standard" linux-image-3.2.0-3-amd64 from debian.
> And....
> nothing changed :(

Well. I found SSD where I can reproduce something similar.
Seems page cache messes something with readahead...

# lsblk /dev/sdc -t
NAME                    ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED RQ-SIZE
sdc                             0    512      0     512     512    0 noop      128
└─sdc_null_crypt (dm-0)         0    512      0     512     512    0           128

- Disable readahead
# blockdev --setra 0 /dev/mapper/sdc_null_crypt
# hdparm -t /dev/mapper/sdc_null_crypt
Timing buffered disk reads: 106 MB in 3.01 seconds = 35.18 MB/sec
(pretty bad)

But with direct-io it still works:
# hdparm --direct -t /dev/mapper/sdc_null_crypt
Timing O_DIRECT disk reads: 698 MB in 3.00 seconds = 232.44 MB/sec

(now with read-ahead set to 4M)
# blockdev --setra 8192 /dev/mapper/sdc_null_crypt
# hdparm -t /dev/mapper/sdc_null_crypt
Timing buffered disk reads: 664 MB in 3.00 seconds = 221.29 MB/sec

So. Can you please try to increase readahead (and also run it with direct-io)?
Just to check if it is the same problem or not...

Thanks,
Milan

^ permalink raw reply [flat|nested] 37+ messages in thread
[parent not found: <500E9099.8050501@redhat.com>]
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD
2012-07-24 13:54 ` Milan Broz
[not found] ` <500E9099.8050501@redhat.com>
@ 2012-07-24 14:27 ` Heinz Diehl
2012-07-24 14:58 ` Heinz Diehl
1 sibling, 1 reply; 37+ messages in thread
From: Heinz Diehl @ 2012-07-24 14:27 UTC (permalink / raw)
To: dm-crypt

On 24.07.2012, Milan Broz wrote:
> - Disable readahead
> # blockdev --setra 0 /dev/mapper/sdc_null_crypt

This seems to be a problem not only with SSD drives, but also with
conventional hard disks. I had never seen anything like this, since I
have had "blockdev --setra 8192 /dev/sda" in my rc.local for ages; it's
well known that it speeds up all disks significantly.

Here's a short test. The partition is encrypted with 256-bit
twofish-xts-plain64:sha256.

[root@wildsau ~]# hdparm -i /dev/sda
/dev/sda:
 Model=Hitachi HTS723232L9A360, FwRev=FC4OC30F, SerialNo=090119FC1400NEGNHSTD
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
 RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
 BuffType=DualPortCache, BuffSize=15058kB, MaxMultSect=16, MultSect=16
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=625142448
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes: pio0 pio1 pio2 pio3 pio4
 DMA modes: mdma0 mdma1 mdma2
 UDMA modes: udma0 udma1 udma2 udma3 udma4 udma5 *udma6
 AdvancedPM=yes: mode=0x80 (128) WriteCache=enabled
 Drive conforms to: unknown: ATA/ATAPI-2,3,4,5,6,7

With "blockdev --setra 8192" and "hdparm -tT":
 Timing buffered disk reads: 234 MB in 3.01 seconds = 77.69 MB/sec

With "blockdev --setra 0" and "hdparm -tT":
 Timing cached reads: 4670 MB in 2.00 seconds = 2336.29 MB/sec
 Timing buffered disk reads: 14 MB in 3.06 seconds = 4.57 MB/sec
                                                     ~~~~~~~~~~

With "blockdev --setra 0" and "hdparm --direct -tT":
 Timing O_DIRECT cached reads: 250 MB in 2.01 seconds = 124.45 MB/sec
 Timing O_DIRECT disk reads: 238 MB in 3.02 seconds = 78.80 MB/sec
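The readahead values passed to "blockdev --setra" in the message above are counted in 512-byte sectors, so the numbers are easier to compare once converted to KiB. A minimal sketch of that arithmetic follows; the helper name ra_sectors_to_kib is made up for illustration and is not part of blockdev:

```shell
#!/bin/sh
# Hypothetical helper (not part of blockdev): convert a readahead value
# given in 512-byte sectors, as used by `blockdev --setra/--getra`,
# into KiB.
ra_sectors_to_kib() {
    echo $(( $1 * 512 / 1024 ))
}

# The kernel default reported in this thread (256 sectors):
ra_sectors_to_kib 256     # prints 128
# The value from the rc.local line quoted above (8192 sectors):
ra_sectors_to_kib 8192    # prints 4096
```

So "--setra 8192" asks for 4 MiB of readahead, against 128 KiB for the default of 256.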
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD
2012-07-24 14:27 ` Heinz Diehl
@ 2012-07-24 14:58 ` Heinz Diehl
2012-07-24 15:38 ` Marc MERLIN
0 siblings, 1 reply; 37+ messages in thread
From: Heinz Diehl @ 2012-07-24 14:58 UTC (permalink / raw)
To: Milan Broz; +Cc: dm-crypt

On 24.07.2012, Heinz Diehl wrote:

[....]

And that's obviously not all. The problem also occurs with _unencrypted_
partitions/devices. Here's the proof (/dev/sda1 is a 500M ext2
partition mounted on /boot; using the same Hitachi rotational drive as
in my previous test):

[root@wildsau /]# blockdev --setra 8192 /dev/sda1
[root@wildsau /]# hdparm -t /dev/sda1
 Timing buffered disk reads: 236 MB in 3.00 seconds = 78.57 MB/sec

[root@wildsau /]# blockdev --setra 0 /dev/sda1
[root@wildsau /]# hdparm -t /dev/sda1
 Timing buffered disk reads: 26 MB in 3.05 seconds = 8.51 MB/sec

[root@wildsau /]# blockdev --setra 0 /dev/sda1
[root@wildsau /]# hdparm --direct -t /dev/sda1
 Timing O_DIRECT disk reads: 236 MB in 3.02 seconds = 78.09 MB/sec

All this is on F17, with a vanilla 3.5 kernel from kernel.org (with just
three small patches of my own, which do not touch the code involved here).
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD
2012-07-24 14:58 ` Heinz Diehl
@ 2012-07-24 15:38 ` Marc MERLIN
2012-07-24 16:48 ` Heinz Diehl
0 siblings, 1 reply; 37+ messages in thread
From: Marc MERLIN @ 2012-07-24 15:38 UTC (permalink / raw)
To: Milan Broz, dm-crypt

On Tue, Jul 24, 2012 at 04:58:17PM +0200, Heinz Diehl wrote:
> On 24.07.2012, Heinz Diehl wrote:
>
> [....]
>
> And that's obviously not all. The problem occurs also with _unencrypted_
> partitions/devices, here's the proof (/dev/sda1 is a 500M ext2
> partition mounted on /boot; using the same Hitachi rotational drive as
> in my previous test):
>
> [root@wildsau /]# blockdev --setra 8192 /dev/sda1
> [root@wildsau /]# hdparm -t /dev/sda1
> Timing buffered disk reads: 236 MB in 3.00 seconds = 78.57 MB/sec

Funny you'd say that. I checked mine (with 3.4.4).
By default, it comes with:

gandalfthegreat:~# blockdev --report /dev/sda2
RO    RA   SSZ   BSZ   StartSec            Size   Device
rw   256   512  4096     502272     26843283456   /dev/sda2

gandalfthegreat:~# blockdev --setra 0 /dev/sda2
gandalfthegreat:~# hdparm -t /dev/sda2
/dev/sda2:
 Timing buffered disk reads: 2 MB in 4.81 seconds = 425.77 kB/sec
gandalfthegreat:~# hdparm --direct -t /dev/sda2
/dev/sda2:
 Timing O_DIRECT disk reads: 432 MB in 3.01 seconds = 143.68 MB/sec

So I know it doesn't make sense, but apparently it's SSD/machine
sensitive, and for me the default of 256 is enough for the block
device, but not enough for a dm-crypt'ed device.

I just checked the other dm-crypt device I just created:

gandalfthegreat:~# blockdev --report /dev/mapper/test
RO    RA   SSZ   BSZ   StartSec            Size   Device
rw   256   512  4096          0     26839089152   /dev/mapper/test

So I do get 256 RA consistently, and it's enough for the raw block
device, but not enough for a dm-crypted device.

gandalfthegreat:~# hdparm -t /dev/mapper/test
/dev/mapper/test:
 Timing buffered disk reads: 66 MB in 3.05 seconds = 21.62 MB/sec

By the way, just for fun, I tried
cryptsetup luksFormat --align-payload=8192 -c twofish-xts-plain64 /dev/sda2
instead of
cryptsetup luksFormat --align-payload=8192 -c aes-xts-plain /dev/sda2
since it was suggested that it might be faster on recent laptops.

Well, on a Lenovo 530 with
model name : Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
I get 177 MB/s for twofish-xts-plain64 and 281 MB/s for aes-xts-plain.

For comparison, I got 406 MB/s for null dmcrypt.

So it sounds like for my SSD, aes-xts-plain is the fastest with
aesni_intel once I've run blockdev --setra 8192.

I tried --setra 81920 (10x more) and it raised aes-xts-plain from
281MB/s to 298MB/s and null dmcrypt from 406MB/s to 446MB/s.

Hope this helps.
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD
2012-07-24 15:38 ` Marc MERLIN
@ 2012-07-24 16:48 ` Heinz Diehl
0 siblings, 0 replies; 37+ messages in thread
From: Heinz Diehl @ 2012-07-24 16:48 UTC (permalink / raw)
To: Marc MERLIN; +Cc: dm-crypt, Milan Broz

On 24.07.2012, Marc MERLIN wrote:
> So I know it doesn't make sense, but apparently it's SSD/machine
> sensitive and for me the default of 256 is enough for the block
> device, but not enough for a dm-crypt'ed device.

I guess this has nothing to do with what kind of drive you have, but
with the real block size of the device. Conventional drives with
512/512 are fine with the default, but advanced format and SSD drives
which run with a 4k block size need more.
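To check what block sizes and readahead a device actually reports, the kernel exposes them under /sys/block/<dev>/queue/. The following is a rough sketch only: the function name show_queue_limits and its optional second argument (an alternative sysfs root, handy for dry runs) are assumptions for illustration, while the three sysfs file names are the standard ones.

```shell
#!/bin/sh
# Hypothetical helper: print the logical and physical block size and the
# readahead (in KiB) that the kernel reports for a block device.
# $1 = device name without /dev/ (e.g. sda)
# $2 = optional sysfs root, defaults to /sys/block (assumption: only
#      useful for testing against a fake directory tree)
show_queue_limits() {
    q="${2:-/sys/block}/$1/queue"
    printf '%s logical=%s physical=%s read_ahead_kb=%s\n' \
        "$1" \
        "$(cat "$q/logical_block_size")" \
        "$(cat "$q/physical_block_size")" \
        "$(cat "$q/read_ahead_kb")"
}

# Example invocation on a real system (not run here):
#   show_queue_limits sda
```

A drive that reports logical=512 physical=512 while internally using 4k blocks would look identical to a conventional drive here, so this only shows what the device claims.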
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD
2012-07-23 21:31 ` Milan Broz
2012-07-24 5:57 ` Marc MERLIN
@ 2012-07-24 6:11 ` Heinz Diehl
1 sibling, 0 replies; 37+ messages in thread
From: Heinz Diehl @ 2012-07-24 6:11 UTC (permalink / raw)
To: dm-crypt

On 24.07.2012, Milan Broz wrote:
> Disk driver should set topology parameters which fdisk uses. But for your
> case all is set to 512 bytes...

For 4k alignment, I use "fdisk -c -u /dev/sda". The first partition
will start at sector 2048, and everything gets properly aligned as long
as I'm using "+G" for defining the partitions. The last partition on a
disk can be misaligned if you choose to use "the rest". Since fdisk
displays the sectors in an inclusive way, the end sector fdisk shows
("p") plus 1 should be divisible by 8. Put the other way round:
(end sector MOD 8 = 7).

It's the same thing with Western Digital and Seagate drives which
use "advanced format".

As far as I know, proper alignment is not the whole story: some drives
don't report that they are using the new block size, so the OS cannot
know (they report 512/512 logical/physical block size instead of
512/4096). The filesystem created on top of these partitions therefore
has to be created with a 4096 block size, e.g.
"mkfs.ext4 -b 4096 /dev/sdx".

I don't know how this is handled via the dmcrypt layer.
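The alignment rule described in the message above (start sector divisible by 8, inclusive end sector plus 1 divisible by 8, with 512-byte sectors) can be written as a small check. This is a sketch under those assumptions; the function name is_4k_aligned and the example sector numbers are made up for illustration and do not come from the thread.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a partition given by its start and
# inclusive end sector (512-byte sectors, as printed by `fdisk -u`)
# begins and ends on a 4 KiB boundary (8 sectors).
is_4k_aligned() {
    start=$1
    end=$2
    [ $(( start % 8 )) -eq 0 ] && [ $(( (end + 1) % 8 )) -eq 0 ]
}

# Illustrative values: a 500 MiB partition starting at sector 2048
# ends at sector 1026047, and 1026047 MOD 8 = 7.
if is_4k_aligned 2048 1026047; then
    echo "aligned"
else
    echo "misaligned"
fi
```

With the old DOS-compatible start at sector 63 the same check fails, since 63 is not a multiple of 8.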
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD
2012-07-22 20:39 ` Marc MERLIN
2012-07-22 21:47 ` Arno Wagner
@ 2012-07-22 21:55 ` Marc MERLIN
1 sibling, 0 replies; 37+ messages in thread
From: Marc MERLIN @ 2012-07-22 21:55 UTC (permalink / raw)
To: Yves-Alexis Perez, dm-crypt, htd

On Sun, Jul 22, 2012 at 01:39:29PM -0700, Marc MERLIN wrote:
> Result is the same:
> gandalfthegreat:~# hdparm -tT /dev/mapper/ssdcrypt
>
> /dev/mapper/ssdcrypt:
> Timing cached reads: 16614 MB in 2.00 seconds = 8317.50 MB/sec
> Timing buffered disk reads: 68 MB in 3.08 seconds = 22.09 MB/sec
>
> Grumble.
>
> It shouldn't be a hardware problem since I do see 400MB/s before encryption.

Ok, I'm not sure I understand.
I mounted /dev/mapper/ssdcrypt (btrfs), and when going through the
filesystem I get much higher speeds than using dd on the raw block device.

But... when I run

The test below shows that I can access the encrypted SSD 10x faster by
talking to it through a btrfs filesystem than by doing a dd read from
the device node.

So I suppose I could just ignore the dd device node thing, though not
really. Let's take a directory with some source files inside:

gandalfthegreat:/var/local# find src | wc -l
15261
gandalfthegreat:/var/local# echo 3 > /proc/sys/vm/drop_caches; time du -sh src
514M src
real 0m5.865s

So on an encrypted spinning disk, it takes 6 seconds.
On my SSD in the same machine with the same encryption and the same
btrfs filesystem, it is 5 times slower:

gandalfthegreat:/mnt/btrfs_pool1/var/local# time du -sh src
514M src
real 0m24.937s

Incidentally, 5x is also the speed difference between my encrypted HD
and encrypted SSD with dd.
Now, why du is 5x slower and dd of a file from the filesystem is 2.5x
faster, I have no idea :-/

See below:

1) drop caches
gandalfthegreat:/mnt/btrfs_pool1/var/local/VirtualBox VMs/w2k_virtual# echo 3 > /proc/sys/vm/drop_caches

gandalfthegreat:/mnt/btrfs_pool1/var/local/VirtualBox VMs/w2k_virtual# dd if=w2k-s001.vmdk of=/dev/null
2146631680 bytes (2.1 GB) copied, 8.03898 s, 267 MB/s
-> 267MB/s reading from the file through the encrypted filesystem. That's good.

For comparison
gandalfthegreat:/mnt/mnt2# dd if=w2k-s001.vmdk of=/dev/null
2146631680 bytes (2.1 GB) copied, 4.33393 s, 495 MB/s
-> almost 500MB/s reading through another unencrypted filesystem on the same SSD

gandalfthegreat:/mnt/btrfs_pool1/var/local/VirtualBox VMs/w2k_virtual# dd if=/dev/mapper/ssdcrypt of=/dev/null bs=1M count=1000
1048576000 bytes (1.0 GB) copied, 45.1234 s, 23.2 MB/s
-> 23MB/s reading from the block device that my FS is mounted from. WTF?

gandalfthegreat:/mnt/btrfs_pool1/var/local/VirtualBox VMs/w2k_virtual# echo 3 > /proc/sys/vm/drop_caches; dd if=w2k-s001.vmdk of=test
2146631680 bytes (2.1 GB) copied, 17.9129 s, 120 MB/s
-> 120MB/s copying a file from the SSD to itself. That's not bad.

gandalfthegreat:/mnt/btrfs_pool1/var/local/VirtualBox VMs/w2k_virtual# echo 3 > /proc/sys/vm/drop_caches; dd if=test of=/dev/null
2146631680 bytes (2.1 GB) copied, 8.4907 s, 253 MB/s
-> reading the newly copied file still shows 253MB/s, good.

gandalfthegreat:/mnt/btrfs_pool1/var/local/VirtualBox VMs/w2k_virtual# dd if=test of=/dev/null
2146631680 bytes (2.1 GB) copied, 2.11001 s, 1.0 GB/s
-> reading without dropping the cache shows 1GB/s

I'm very lost now, any idea what's going on?

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
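The rates dd prints in the message above are just bytes copied divided by elapsed seconds, in decimal megabytes (10^6 bytes). A small sketch that recomputes two of them from the quoted byte counts and timings; the helper name throughput_mb_s is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: compute throughput in MB/s (1 MB = 10^6 bytes,
# the unit dd uses in its summary line) from a byte count and a
# duration in seconds, rounded to a whole number.
throughput_mb_s() {
    LC_ALL=C awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.0f\n", bytes / secs / 1000000 }'
}

# File read through the encrypted btrfs filesystem:
throughput_mb_s 2146631680 8.03898    # prints 267
# Raw read from /dev/mapper/ssdcrypt:
throughput_mb_s 1048576000 45.1234    # prints 23
```

The ratio of the two results is roughly the 10x gap between filesystem reads and device-node reads mentioned in the message.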
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD
2012-07-22 19:07 [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD Marc MERLIN
2012-07-22 19:47 ` Yves-Alexis Perez
@ 2012-07-22 20:22 ` Heinz Diehl
2012-08-12 12:49 ` Pasi Kärkkäinen
2 siblings, 0 replies; 37+ messages in thread
From: Heinz Diehl @ 2012-07-22 20:22 UTC (permalink / raw)
To: dm-crypt

On 22.07.2012, Marc MERLIN wrote:
> Timing buffered disk reads: 70 MB in 3.06 seconds = 22.91 MB/sec <<<<

I don't know why the read speed is that slow in your case, especially
as you are using AES-NI, which should give you the highest speed
available. Maybe others here on the list have a suggestion. You should
probably provide some more information.

Otherwise, on the newer Intel i3/i5/i7, twofish-3way is faster than
AES. You could try to re-format your drive with twofish-xts-plain64
and add twofish_common, twofish_x86_64 and twofish_x86_64_3way to
your initramfs (as long as your kernel is built with these enabled).
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD
2012-07-22 19:07 [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD Marc MERLIN
2012-07-22 19:47 ` Yves-Alexis Perez
2012-07-22 20:22 ` Heinz Diehl
@ 2012-08-12 12:49 ` Pasi Kärkkäinen
2012-08-16 7:43 ` Marc MERLIN
2 siblings, 1 reply; 37+ messages in thread
From: Pasi Kärkkäinen @ 2012-08-12 12:49 UTC (permalink / raw)
To: Marc MERLIN; +Cc: dm-crypt

On Sun, Jul 22, 2012 at 12:07:58PM -0700, Marc MERLIN wrote:
>
> I got a new Samsung 830 512GB SSD which is supposed to be very high
> performance.
> The raw device seems fast enough on a quick hdparm test:
> /dev/sda4:
> Timing cached reads: 14258 MB in 2.00 seconds = 7136.70 MB/sec
> Timing buffered disk reads: 1392 MB in 3.00 seconds = 463.45 MB/sec <<<<
>
> which is 4x faster than my non encrypted spinning disk, as expected.
>
>
> But once I encrypt it, it drops to 5 times slower than my 1TB spinning
> disk in the same laptop:
> gandalfthegreat:~# hdparm -tT /dev/mapper/ssdcrypt
> /dev/mapper/ssdcrypt:
> Timing cached reads: 15412 MB in 2.00 seconds = 7715.37 MB/sec
> Timing buffered disk reads: 70 MB in 3.06 seconds = 22.91 MB/sec <<<<
>
> gandalfthegreat:~# hdparm -tT /dev/mapper/cryptroot (spinning disk)
> /dev/mapper/cryptroot:
> Timing cached reads: 16222 MB in 2.00 seconds = 8121.03 MB/sec
> Timing buffered disk reads: 308 MB in 3.01 seconds = 102.24 MB/sec <<<<
>

Hello,

I didn't read the whole thread, but are you aware that many/most SSDs
use internal processors for compression, deduplication, etc.? So if you
write encrypted data to the SSD, it's not able to do its internal magic,
and thus you get a lot worse performance compared to non-encrypted data.

So did you try benchmarking with *random* data *without* encryption?
Also, always write to the disk first, and only read after it has
already been written to.

-- Pasi
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD
2012-08-12 12:49 ` Pasi Kärkkäinen
@ 2012-08-16 7:43 ` Marc MERLIN
[not found] ` <502D1F96.3080905@andregall.de>
0 siblings, 1 reply; 37+ messages in thread
From: Marc MERLIN @ 2012-08-16 7:43 UTC (permalink / raw)
To: Pasi Kärkkäinen; +Cc: dm-crypt

On Sun, Aug 12, 2012 at 03:49:29PM +0300, Pasi Kärkkäinen wrote:
> I didn't read the whole thread, but are you aware that many/most SSDs use
> internal processors for compression, deduplication, etc ..

Yes

> so if you write encrypted data to the SSD, it's not able to do its internal magic,
> and thus you get a lot worse performance compared to non-encrypted data.

Only on some controllers like SandForce; the Samsung 830 wasn't
supposed to be affected.

> So did you try benchmarking with *random* data *without* encryption?
> Also always first write to the disk, and only read after it has been already written to.

Yes, both were parts of my tests.

But I owed everyone an update, which I just finished typing:
http://marc.merlins.org/perso/linux/post_2012-08-15_The-tale-of-SSDs_-Crucial-C300-early-Death_-Samsung-830-extreme-random-IO-slowness_-and-settling-with-OCZ-Vertex-4.html

Basically, the Samsung 830 just sucks. I got 2 of them, and they both
utterly sucked. There is no excuse for an SSD being several times slower
than a slow hard drive on _READs_ (not even talking about writes).

I'm not sure how I could have gotten 2 bad drives from Samsung in 2
different shipments, so I'm afraid the entire line may be bad. At least,
it was for me after extensive benchmarking, even using their own
Windows benchmarking tool.

In the end, I got an OCZ Vertex 4, and it's superfast as per the
benchmarks I posted in the link above.

Cheers,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
[parent not found: <502D1F96.3080905@andregall.de>]
* Re: [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD
[not found] ` <502D1F96.3080905@andregall.de>
@ 2012-08-16 17:57 ` Marc MERLIN
0 siblings, 0 replies; 37+ messages in thread
From: Marc MERLIN @ 2012-08-16 17:57 UTC (permalink / raw)
To: André Gall; +Cc: dm-crypt

On Thu, Aug 16, 2012 at 06:28:06PM +0200, André Gall wrote:
> Hi Marc,
>
> thank you very much for your detailed investigation. I wanted to buy the
> Samsung SSD 830 in the next few days to replace my old SSD. I chose this
> drive explicitly because of its "supposed" performance with encrypted
> data according to the benchmarks of incompressible data all around the
> internet. Seems like I should rethink this decision :)

That's also why I bought it, and obviously it didn't work out.
I seem to remember that the OCZ Vertex 4 also does ok with data that
doesn't compress.

> >Samsung's old Benchmark registered so low on their own Random IO test
> >that the bar graph showing speed was a single line at the '0' mark.
>
> You don't happen to have a screenshot of this? This would make it much
> easier for people to understand there's something wrong!

Sure thing. Page updated:
http://marc.merlins.org/perso/linux/post_2012-08-15_The-tale-of-SSDs_-Crucial-C300-early-Death_-Samsung-830-extreme-random-IO-slowness_-and-settling-with-OCZ-Vertex-4.html

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
Thread overview: 37+ messages
2012-07-22 19:07 [dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD Marc MERLIN
2012-07-22 19:47 ` Yves-Alexis Perez
2012-07-22 20:39 ` Marc MERLIN
2012-07-22 21:47 ` Arno Wagner
2012-07-23 6:07 ` Yves-Alexis Perez
2012-07-23 6:28 ` Marc MERLIN
2012-07-23 8:14 ` Arno Wagner
2012-07-23 10:46 ` Milan Broz
2012-07-23 11:09 ` Yves-Alexis Perez
2012-07-23 11:37 ` Milan Broz
2012-07-23 15:08 ` André Gall
2012-07-23 17:27 ` André Gall
2012-07-24 14:06 ` Heinz Diehl
2012-07-24 14:16 ` Milan Broz
2012-07-23 16:12 ` Marc MERLIN
2012-07-23 16:19 ` Yves-Alexis Perez
2012-07-23 17:54 ` Marc MERLIN
2012-07-23 19:26 ` Yves-Alexis Perez
2012-07-23 17:15 ` Milan Broz
2012-07-23 17:51 ` Marc MERLIN
2012-07-23 21:31 ` Milan Broz
2012-07-24 5:57 ` Marc MERLIN
2012-07-24 6:25 ` Heinz Diehl
2012-07-24 15:02 ` Marc MERLIN
2012-07-24 15:19 ` Milan Broz
2012-07-24 16:09 ` Marc MERLIN
2012-07-24 13:54 ` Milan Broz
[not found] ` <500E9099.8050501@redhat.com>
2012-07-24 14:27 ` Heinz Diehl
2012-07-24 14:58 ` Heinz Diehl
2012-07-24 15:38 ` Marc MERLIN
2012-07-24 16:48 ` Heinz Diehl
2012-07-24 6:11 ` Heinz Diehl
2012-07-22 21:55 ` Marc MERLIN
2012-07-22 20:22 ` Heinz Diehl
2012-08-12 12:49 ` Pasi Kärkkäinen
2012-08-16 7:43 ` Marc MERLIN
[not found] ` <502D1F96.3080905@andregall.de>
2012-08-16 17:57 ` Marc MERLIN