All of lore.kernel.org
 help / color / mirror / Atom feed
* PROBLEM: SSD access time with dm-crypt is way too high
@ 2011-01-09 13:35 Michael Zugelder
  2011-01-09 14:05 ` [dm-devel] " Milan Broz
  2011-01-11 12:07 ` Yakov Hrebtov
  0 siblings, 2 replies; 7+ messages in thread
From: Michael Zugelder @ 2011-01-09 13:35 UTC (permalink / raw)
  To: linux-net, dm-devel; +Cc: jens.axboe, Neil Brown

Hello,

from kernel 2.6.33 on (at least up to 2.6.37) the access time when using
dm-crypt on SSD is way too high. I would expect it to be a bit worse
than directly accessing the device (~0.2ms here), but currently it's at
exactly 10ms (except a few outlines). This corresponds to CONFIG_HZ=100.

The high access time makes the system feel very slow (if you're
expecting SSD performance). On kernel 2.6.32 this wasn't an issue,
access time stayed below 0.3ms. Those numbers are from palimpsest, but
the difference is so drastic, that pretty much everything you can think
of shows a difference (e. g. find-ing all files on the disk goes up from
tens of seconds to multiple minutes). 
Just to be clear: This isn't about a dm-crypt encrypted disk being
slower that an unencrypted disk. It's about the access time of the same
encrypted disk rising to >= 10ms, which is at least an order of
magnitude for an SSD. I suspect with an HDD, the difference isn't much
noticeable.

I filed a bugreport on the Ubuntu bugtracker with a few pretty graphs,
vmstat and LatencyTOP output. If that would be useful, see
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/653611 .

A bisect between .32 and .33 revealed some more information. I arrived
at 79da064, which is a revert of fb1e753 (block: improve
queue_should_plug() by looking at IO depths), so I compiled fb1e753.
With this 2.6.31 kernel, the problem was gone. One step back to 1f98a13,
the access time was high again.

So, it seems that fb1e753 (block: improve queue_should_plug() by looking
at IO depths) improves dm-crypt access time on SSDs (by a factor of over
40 here), but (as in the revert commit stated) reduces sequential
throughput in some cases. I applied fb1e753 to 2.6.37 and the access
time went down to the .32 value.

For me, it seems that dm-crypt interferes with queuing, which totally
destroys access time. If the disk empties the queue after a few
fractions of a millisecond, and it only gets refilled every tick (10ms),
wouldn't this explain the observed behavior? Correspondingly, a 300HZ
kernel achieves about 3ms access time Going all the way up to 1000HZ
hardly improves that number.

Please keep me on CC, since I am not subscribed.

Thanks
Michael


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [dm-devel] PROBLEM: SSD access time with dm-crypt is way too high
  2011-01-09 13:35 PROBLEM: SSD access time with dm-crypt is way too high Michael Zugelder
@ 2011-01-09 14:05 ` Milan Broz
  2011-01-10 17:19   ` [PATCH] " Mikulas Patocka
  2011-01-11 12:07 ` Yakov Hrebtov
  1 sibling, 1 reply; 7+ messages in thread
From: Milan Broz @ 2011-01-09 14:05 UTC (permalink / raw)
  To: device-mapper development; +Cc: Michael Zugelder, linux-net, Jens Axboe

On 01/09/2011 02:35 PM, Michael Zugelder wrote:

Hi,

> Just to be clear: This isn't about a dm-crypt encrypted disk being
> slower that an unencrypted disk. It's about the access time of the same
> encrypted disk rising to >= 10ms, which is at least an order of
> magnitude for an SSD. I suspect with an HDD, the difference isn't much
> noticeable.

yep, it was reported several times, thanks for that bisect.
We will need to retest it with per-cpu dm-crypt patches
(should be in linux-next soon) but I expect that this problem remains.

> I filed a bugreport on the Ubuntu bugtracker with a few pretty graphs,
> vmstat and LatencyTOP output. If that would be useful, see
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/653611 .

(Maybe Ubuntu people can forward issue to upstream list next time?)

> A bisect between .32 and .33 revealed some more information. I arrived
> at 79da064, which is a revert of fb1e753 (block: improve
> queue_should_plug() by looking at IO depths), so I compiled fb1e753.
> With this 2.6.31 kernel, the problem was gone. One step back to 1f98a13,
> the access time was high again.
> 
> So, it seems that fb1e753 (block: improve queue_should_plug() by looking
> at IO depths) improves dm-crypt access time on SSDs (by a factor of over
> 40 here), but (as in the revert commit stated) reduces sequential
> throughput in some cases. I applied fb1e753 to 2.6.37 and the access
> time went down to the .32 value.
> 
> For me, it seems that dm-crypt interferes with queuing, which totally
> destroys access time.

Yes, dm-crypt clones requests and process them in its own queue so there
is some interference.

I am not sure if this is fixable by block layer only patch (like fb1e753),

Maybe dm-crypt should call unplug explicitly, flush crypt workqueue
explicitly or something like that... dunno

Jens, do you have an idea how to improve it and not lose throughput?

Milan

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH] PROBLEM: SSD access time with dm-crypt is way too high
  2011-01-09 14:05 ` [dm-devel] " Milan Broz
@ 2011-01-10 17:19   ` Mikulas Patocka
  2011-01-10 17:50     ` Milan Broz
                       ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Mikulas Patocka @ 2011-01-10 17:19 UTC (permalink / raw)
  To: Michael Zugelder
  Cc: linux-net, device-mapper development, Jens Axboe, Milan Broz



On Sun, 9 Jan 2011, Milan Broz wrote:

> On 01/09/2011 02:35 PM, Michael Zugelder wrote:
> 
> Hi,
> 
> > Just to be clear: This isn't about a dm-crypt encrypted disk being
> > slower that an unencrypted disk. It's about the access time of the same
> > encrypted disk rising to >= 10ms, which is at least an order of
> > magnitude for an SSD. I suspect with an HDD, the difference isn't much
> > noticeable.
> 
> yep, it was reported several times, thanks for that bisect.
> We will need to retest it with per-cpu dm-crypt patches
> (should be in linux-next soon) but I expect that this problem remains.
> 
> > I filed a bugreport on the Ubuntu bugtracker with a few pretty graphs,
> > vmstat and LatencyTOP output. If that would be useful, see
> > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/653611 .
> 
> (Maybe Ubuntu people can forward issue to upstream list next time?)
> 
> > A bisect between .32 and .33 revealed some more information. I arrived
> > at 79da064, which is a revert of fb1e753 (block: improve
> > queue_should_plug() by looking at IO depths), so I compiled fb1e753.
> > With this 2.6.31 kernel, the problem was gone. One step back to 1f98a13,
> > the access time was high again.
> > 
> > So, it seems that fb1e753 (block: improve queue_should_plug() by looking
> > at IO depths) improves dm-crypt access time on SSDs (by a factor of over
> > 40 here), but (as in the revert commit stated) reduces sequential
> > throughput in some cases. I applied fb1e753 to 2.6.37 and the access
> > time went down to the .32 value.
> > 
> > For me, it seems that dm-crypt interferes with queuing, which totally
> > destroys access time.
> 
> Yes, dm-crypt clones requests and process them in its own queue so there
> is some interference.
> 
> I am not sure if this is fixable by block layer only patch (like fb1e753),
> 
> Maybe dm-crypt should call unplug explicitly, flush crypt workqueue
> explicitly or something like that... dunno
> 
> Jens, do you have an idea how to improve it and not lose throughput?
> 
> Milan
> 
> --
> dm-devel mailing list
> dm-devel@redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel
> 

Hi

Try this patch. I coded it long time ago, but it was forgotten somehow.

Mikulas

---

Unplug the queue when all IOs are finished.

If we don't unplug the queue, there will be 3ms delay (specified in
block/blk-settings.c, blk_queue_make_request) before the requests is submitted
to the device. So we must unplug the queue.

To avoid excessive unplugging, we must unplug only when we finish processing the
last request.

dm-crypt uses workqueue to queue the IOs. Unfortunatelly, the workqueue API
doesn't allow to query whether there are some more requests pending on the
queue. There API provides function to query "is this work queued on any
workqueue?", but it doesn't provide a function to query "is there any work
pending on this workqueue?".

There are two approaches that I considered:

1. make a special work for unplug. After queuing each IO's work, cancel the
unplug work and queue it again (so that it will always be queued as a last
entry). Unfortunatelly, canceling a work is rather slow operation so I decided
to not use this approach.

2. use a special pointer that points to the last IO. When the IO is finished
and the pointer matches this IO, we know that it was the last IO and we should
unplug. This patch implements this approach.

The pointer is not protected by any locks. So there may be race conditions with
it's manipulation (for example, on architectures with non-atomic memory writes,
simultaneous writes to the pointer may make it to point to garbage). However,
the pointer is never dereferenced, so this shouldn't cause crashes. The race
condition may only cause that unplug is missed and this triggers the 3ms delay.

Write requests are dispatched directly from kcryptd workqueue, so wee need an
unplug for it as well.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/md/dm-crypt.c |   36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

Index: linux-2.6.36-rc3-fast/drivers/md/dm-crypt.c
===================================================================
--- linux-2.6.36-rc3-fast.orig/drivers/md/dm-crypt.c	2010-09-02 22:50:52.000000000 +0200
+++ linux-2.6.36-rc3-fast/drivers/md/dm-crypt.c	2010-09-02 22:59:01.000000000 +0200
@@ -107,6 +107,16 @@ struct crypt_config {
 	struct workqueue_struct *io_queue;
 	struct workqueue_struct *crypt_queue;
 
+  	/*
+	 * The last io submitted to io_queue and crypt_queue.
+	 * This pointers are not protected by any locks and thus must not be
+	 * dereferenced --- they may contain garbage. The only allowed operation
+	 * is to compare these pointers with current io being processed. If they
+	 * match, the block device queue should be unplugged.
+	 */
+	struct dm_crypt_io *io_queue_last_io;
+	struct dm_crypt_io *crypt_queue_last_io;
+
 	char *cipher;
 	char *cipher_mode;
 
@@ -731,17 +741,29 @@ static void kcryptd_io_write(struct dm_c
 static void kcryptd_io(struct work_struct *work)
 {
 	struct dm_crypt_io *io = container_of(work, struct dm_crypt_io, work);
+	struct crypt_config *cc = io->target->private;
 
 	if (bio_data_dir(io->base_bio) == READ)
 		kcryptd_io_read(io);
 	else
 		kcryptd_io_write(io);
+
+	/*
+	 * Neither io nor cc->io_queue_last_io may be dereferenced here.
+	 * This check just checks if it was the last io.
+	 */
+	if (io == cc->io_queue_last_io) {
+		struct request_queue *q = bdev_get_queue(cc->dev->bdev);
+		blk_unplug(q);
+		cc->io_queue_last_io = NULL;
+	}
 }
 
 static void kcryptd_queue_io(struct dm_crypt_io *io)
 {
 	struct crypt_config *cc = io->target->private;
 
+	cc->io_queue_last_io = io;
 	INIT_WORK(&io->work, kcryptd_io);
 	queue_work(cc->io_queue, &io->work);
 }
@@ -915,17 +937,31 @@ static void kcryptd_async_done(struct cr
 static void kcryptd_crypt(struct work_struct *work)
 {
 	struct dm_crypt_io *io = container_of(work, struct dm_crypt_io, work);
+	struct crypt_config *cc = io->target->private;
 
 	if (bio_data_dir(io->base_bio) == READ)
 		kcryptd_crypt_read_convert(io);
 	else
 		kcryptd_crypt_write_convert(io);
+
+	/*
+	 * Neither io nor cc->crypt_queue_last_io may be dereferenced here.
+	 * This check just checks if it was the last writing io.
+	 */
+	if (io == cc->crypt_queue_last_io) {
+		struct request_queue *q = bdev_get_queue(cc->dev->bdev);
+		blk_unplug(q);
+		cc->crypt_queue_last_io = NULL;
+	}
 }
 
 static void kcryptd_queue_crypt(struct dm_crypt_io *io)
 {
 	struct crypt_config *cc = io->target->private;
 
+	/* Write ios need unplugging in kcryptd work thread */
+	if (bio_data_dir(io->base_bio) == WRITE)
+		cc->crypt_queue_last_io = io;
 	INIT_WORK(&io->work, kcryptd_crypt);
 	queue_work(cc->crypt_queue, &io->work);
 }

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] PROBLEM: SSD access time with dm-crypt is way too high
  2011-01-10 17:19   ` [PATCH] " Mikulas Patocka
@ 2011-01-10 17:50     ` Milan Broz
  2011-01-10 18:47     ` [dm-devel] " Christoph Hellwig
  2011-01-10 20:06     ` Michael Zugelder
  2 siblings, 0 replies; 7+ messages in thread
From: Milan Broz @ 2011-01-10 17:50 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Michael Zugelder, linux-net, device-mapper development,
	Jens Axboe

On 01/10/2011 06:19 PM, Mikulas Patocka wrote:
> Try this patch. I coded it long time ago, but it was forgotten somehow.

No, it was not forgotten, I just do not like this approach :)

> +  	/*
> +	 * The last io submitted to io_queue and crypt_queue.
> +	 * This pointers are not protected by any locks and thus must not be
> +	 * dereferenced --- they may contain garbage. The only allowed operation
> +	 * is to compare these pointers with current io being processed. If they
> +	 * match, the block device queue should be unplugged.
> +	 */
> +	struct dm_crypt_io *io_queue_last_io;
> +	struct dm_crypt_io *crypt_queue_last_io;

We can now process ios in parallel on different CPUs, that code will
rewrite the context struct on every IO and at least it will cause cache bouncing.
(it will not apply on current code anyway, see patches in linux-next)

And I think it can kill performance is some cases (if the requests are submitted
such way that you will call unplug after every io.)

We need something clever here - I wonder if fixing the reverted fb1e753 can
help here....

Milan

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [dm-devel] [PATCH] PROBLEM: SSD access time with dm-crypt is way too high
  2011-01-10 17:19   ` [PATCH] " Mikulas Patocka
  2011-01-10 17:50     ` Milan Broz
@ 2011-01-10 18:47     ` Christoph Hellwig
  2011-01-10 20:06     ` Michael Zugelder
  2 siblings, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2011-01-10 18:47 UTC (permalink / raw)
  To: device-mapper development
  Cc: Michael Zugelder, linux-net, Jens Axboe, Milan Broz

On Mon, Jan 10, 2011 at 12:19:25PM -0500, Mikulas Patocka wrote:
> There are two approaches that I considered:
> 
> 1. make a special work for unplug. After queuing each IO's work, cancel the
> unplug work and queue it again (so that it will always be queued as a last
> entry). Unfortunatelly, canceling a work is rather slow operation so I decided
> to not use this approach.
> 
> 2. use a special pointer that points to the last IO. When the IO is finished
> and the pointer matches this IO, we know that it was the last IO and we should
> unplug. This patch implements this approach.

What about keeping a reference count of pending I/O requests in a shared
structure, with the last decrement doing the unplug?


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] PROBLEM: SSD access time with dm-crypt is way too high
  2011-01-10 17:19   ` [PATCH] " Mikulas Patocka
  2011-01-10 17:50     ` Milan Broz
  2011-01-10 18:47     ` [dm-devel] " Christoph Hellwig
@ 2011-01-10 20:06     ` Michael Zugelder
  2 siblings, 0 replies; 7+ messages in thread
From: Michael Zugelder @ 2011-01-10 20:06 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Michael Zugelder, linux-net, device-mapper development,
	Jens Axboe, Milan Broz

On Mon, 2011-01-10 at 12:19 -0500, Mikulas Patocka wrote: 
> Try this patch.

Hi,

just compiled it on top of 2.6.37 without the fb1e753 and ran seeker [1]
for 100 seconds:

raw disk:       0.2173ms
fb1e753 patch:  0.2643ms
your patch:	0.2676ms

Throughput is within 5% of max, even with the 10ms access time.

On Mon, 2011-01-10 at 18:50 +0100, Milan Broz wrote:
> We can now process ios in parallel on different CPUs, that code will
> rewrite the context struct on every IO and at least it will cause cache bouncing.

I should say that those numbers come from a CRYPTO_PCRYPT=n kernel.


Anyone got an idea for a good testcase? I tried replicating the issues
with with a loop'd file on a tmpfs and failed. Using a usb-attached HDD
also didn't work.


Michael

[1] http://www.linuxinsight.com/how_fast_is_your_disk.html


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: PROBLEM: SSD access time with dm-crypt is way too high
  2011-01-09 13:35 PROBLEM: SSD access time with dm-crypt is way too high Michael Zugelder
  2011-01-09 14:05 ` [dm-devel] " Milan Broz
@ 2011-01-11 12:07 ` Yakov Hrebtov
  1 sibling, 0 replies; 7+ messages in thread
From: Yakov Hrebtov @ 2011-01-11 12:07 UTC (permalink / raw)
  To: device-mapper development; +Cc: michaelzugelder

09.01.2011 18:35, Michael Zugelder wrote:
> from kernel 2.6.33 on (at least up to 2.6.37) the access time when using
> dm-crypt on SSD is way too high. I would expect it to be a bit worse
> than directly accessing the device (~0.2ms here), but currently it's at
> exactly 10ms (except a few outlines). This corresponds to CONFIG_HZ=100.

I have similar problems, but 2.6.32 is also affected for me.
Look details here: https://bugzilla.kernel.org/show_bug.cgi?id=17892

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2011-01-11 12:07 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-01-09 13:35 PROBLEM: SSD access time with dm-crypt is way too high Michael Zugelder
2011-01-09 14:05 ` [dm-devel] " Milan Broz
2011-01-10 17:19   ` [PATCH] " Mikulas Patocka
2011-01-10 17:50     ` Milan Broz
2011-01-10 18:47     ` [dm-devel] " Christoph Hellwig
2011-01-10 20:06     ` Michael Zugelder
2011-01-11 12:07 ` Yakov Hrebtov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.