dm-devel.redhat.com archive mirror
* dm-crypt parallelization patches
@ 2013-09-03 19:06 Mikulas Patocka
  2013-09-03 20:07 ` Andi Kleen
  2013-09-03 20:49 ` Milan Broz
  0 siblings, 2 replies; 22+ messages in thread
From: Mikulas Patocka @ 2013-09-03 19:06 UTC
  To: Andi Kleen; +Cc: dm-devel, Milan Broz, Alasdair G. Kergon

Hi Andi

You wrote the original dm-crypt parallelization patch (commit 
c029772125594e31eb1a5ad9e0913724ed9891f2). That patch improves 
parallelization by encrypting data on the same CPU that submitted the 
request.

I improved on that so that encryption can be performed by all processors: 
it uses an unbound workqueue, so any processor can do the encryption. My 
patches improve performance mainly for sequential reads; your original 
patch parallelized sequential reads badly because most requests were 
submitted by the same CPU.
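
To illustrate the difference described above, here is a toy user-space 
model in Python. It is not kernel code and not taken from the patch 
series. It assumes every request costs one unit of encryption time, and 
it stands in for an unbound workqueue by picking the least-loaded CPU. 
The function names are made up for this sketch.

```python
# Toy model (not kernel code): compare how encryption work spreads across
# CPUs when it is pinned to the submitting CPU versus handed to any CPU.
from collections import Counter


def per_cpu_dispatch(submitting_cpus):
    """Each request is encrypted on the CPU that submitted it."""
    return Counter(submitting_cpus)


def unbound_dispatch(submitting_cpus, ncpus):
    """Each request goes to the least-loaded CPU.

    This is an assumed stand-in for an unbound workqueue; the real
    kernel placement decision is more complex.
    """
    load = Counter({cpu: 0 for cpu in range(ncpus)})
    for _ in submitting_cpus:
        cpu = min(range(ncpus), key=lambda c: load[c])
        load[cpu] += 1
    return load


def makespan(load):
    """Finish time if every request costs one unit: the busiest CPU."""
    return max(load.values())


# Sequential read: assume all 8 requests are submitted from CPU 0.
requests = [0] * 8
bound = per_cpu_dispatch(requests)
unbound = unbound_dispatch(requests, ncpus=4)
print(makespan(bound), makespan(unbound))  # prints "8 2"
```

In this model the per-CPU variant leaves three of the four CPUs idle 
when a single CPU submits everything, which is the sequential-read case 
mentioned above.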

I put my patch series at 
http://people.redhat.com/mpatocka/patches/kernel/dm-crypt-paralelizace/current/series.html

I'd like it if you could look at it and ack it for inclusion in the 
upstream kernel.

Mikulas

* dm-crypt performance
@ 2013-03-26  3:47 Mikulas Patocka
  2013-03-26 12:27 ` [dm-devel] " Alasdair G Kergon
  0 siblings, 1 reply; 22+ messages in thread
From: Mikulas Patocka @ 2013-03-26  3:47 UTC
  To: Mike Snitzer, Alasdair G. Kergon, dm-devel

Hi

I performed some dm-crypt performance tests as Mike suggested.

It turns out that unbound workqueue performance has improved somewhere 
between kernel 3.2 (when I made the dm-crypt patches) and 3.8, so the 
patches for hand-built dispatch are no longer needed.

For RAID-0 composed of two disks with total throughput 260MB/s, the 
unbound workqueue performs as well as the hand-built dispatch (both 
sustain the 260MB/s transfer rate).

For a ramdisk, the unbound workqueue performs better than the hand-built 
dispatch (620MB/s vs. 400MB/s). The unbound workqueue with the patch that 
Mike suggested (git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git) 
improves performance slightly on a ramdisk compared to 3.8 (700MB/s vs. 
620MB/s).



However, there is still the problem with request ordering. Milan found out 
that under some circumstances parallel dm-crypt has worse performance than 
the previous dm-crypt code. I found out that this is not caused by 
deficiencies in the code that distributes work to individual processors. 
The performance drop occurs because distributing write bios to multiple 
processors causes encryption to finish out of order, and the I/O 
scheduler is unable to merge these out-of-order bios.
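
The effect can be sketched with a toy model in Python. It is a 
simplification, not a description of any real I/O scheduler: it assumes 
a bio is merged only when it starts exactly where the previously queued 
bio ended, and it assumes 64 contiguous bios of 8 sectors each. The 
function name is made up for this sketch.

```python
import random


def count_requests(sectors, bio_size=8):
    """Count requests left after merging.

    Simplifying assumption: a bio is merged only if it starts where the
    previously queued bio ended. Real schedulers have more merge paths.
    """
    requests = 0
    prev_end = None
    for start in sectors:
        if start != prev_end:
            requests += 1  # cannot be merged: starts a new request
        prev_end = start + bio_size
    return requests


bio_size = 8
in_order = [i * bio_size for i in range(64)]  # 64 contiguous bios
out_of_order = in_order[:]
random.Random(0).shuffle(out_of_order)  # stand-in for parallel completion

print(count_requests(in_order))      # one merged request
print(count_requests(out_of_order))  # many small requests
```

Under these assumptions the in-order stream collapses into a single 
request, while the shuffled stream stays fragmented into many requests.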

The deadline and noop schedulers perform better (only a 50% slowdown 
compared to the old dm-crypt); CFQ performs very badly (an 8x slowdown).


If I sort the requests in dm-crypt so that they come out in the same order 
as they were received, there is no longer any slowdown: the new crypt 
performs as well as the old crypt. The last time I submitted the patches, 
people objected to sorting requests in dm-crypt, saying that the I/O 
scheduler should sort them. But it doesn't. This problem still persists in 
current kernels.
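
One way to picture "come out in the same order as they were received" 
is a reorder buffer keyed by a submission sequence number. The Python 
sketch below is only an illustration under that assumption. This 
message does not say how the patches implement the sorting, and the 
class and method names here are hypothetical.

```python
import heapq


class ReorderBuffer:
    """Release completed items in the order they were submitted."""

    def __init__(self):
        self._next_seq = 0  # sequence number for the next submission
        self._next_out = 0  # sequence number that must be released next
        self._done = []     # min-heap of (seq, item) that finished early

    def submit(self):
        """Assign a sequence number when a request is received."""
        seq = self._next_seq
        self._next_seq += 1
        return seq

    def complete(self, seq, item):
        """Record a finished item; return all items now releasable."""
        heapq.heappush(self._done, (seq, item))
        released = []
        while self._done and self._done[0][0] == self._next_out:
            released.append(heapq.heappop(self._done)[1])
            self._next_out += 1
        return released


buf = ReorderBuffer()
seqs = [buf.submit() for _ in range(4)]
out = []
for seq in (2, 0, 1, 3):  # encryption finishes out of order
    out.extend(buf.complete(seq, f"bio{seq}"))
print(out)  # ['bio0', 'bio1', 'bio2', 'bio3']
```

An item that finishes early is held back until every earlier item has 
finished, so the downstream consumer sees the original order.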


For best performance we could use the unbound workqueue implementation 
with request sorting, if people don't object to the request sorting being 
done in dm-crypt.


Mikulas


end of thread, other threads:[~2013-09-12  3:51 UTC | newest]

Thread overview: 22+ messages
2013-09-03 19:06 dm-crypt parallelization patches Mikulas Patocka
2013-09-03 20:07 ` Andi Kleen
2013-09-11 23:03   ` Mikulas Patocka
2013-09-11 23:33     ` Andi Kleen
2013-09-12  3:51     ` Milan Broz
2013-09-03 20:49 ` Milan Broz
  -- strict thread matches above, loose matches on Subject: below --
2013-03-26  3:47 dm-crypt performance Mikulas Patocka
2013-03-26 12:27 ` [dm-devel] " Alasdair G Kergon
2013-03-26 20:05   ` Milan Broz
2013-03-26 20:28     ` Mike Snitzer
2013-03-28 18:53       ` Tejun Heo
2013-03-28 19:33         ` Vivek Goyal
2013-03-28 19:44           ` Tejun Heo
2013-03-28 20:38             ` Vivek Goyal
2013-03-28 20:45               ` Tejun Heo
2013-04-09 17:51                 ` dm-crypt parallelization patches Mikulas Patocka
2013-04-09 17:57                   ` Tejun Heo
2013-04-09 18:08                     ` Mikulas Patocka
2013-04-09 18:10                       ` Tejun Heo
2013-04-09 18:42                         ` Vivek Goyal
2013-04-09 18:57                           ` Tejun Heo
2013-04-09 19:13                             ` Vivek Goyal
2013-04-09 19:42                         ` Mikulas Patocka
2013-04-09 19:52                           ` Tejun Heo
2013-04-09 20:32                             ` Mikulas Patocka
2013-04-09 21:02                               ` Tejun Heo
2013-04-09 21:03                                 ` Tejun Heo
2013-04-09 21:07                               ` Vivek Goyal
2013-04-09 21:18                                 ` Mikulas Patocka
2013-04-10 19:24                                   ` Vivek Goyal
2013-04-09 18:36                   ` Vivek Goyal
