* Poor snapshot performance in linux-3.19
From: Dennis Yang @ 2015-05-28 10:18 UTC
To: device-mapper development

Hi,

I have a workstation running Fedora 21 with a linux-3.19 kernel, and I
created a thin pool on top of a RAID0 array (chunk size = 512KB) built
from five Crucial 256GB SSDs.
[root@localhost ~]# dmsetup create pool --table "0 2478300160 thin-pool /dev/md0p1 /dev/md0p2 1024 0 1 skip_block_zeroing"

Then I created a small thin volume with the following commands.
[root@localhost ~]# dmsetup message pool 0 "create_thin 0"
[root@localhost ~]# dmsetup create thin --table "0 400000000 thin /dev/mapper/pool 0"

After that, I used both dd and fio for throughput testing and got the
following result.
[root@localhost ~]# dd if=/dev/zero of=/dev/mapper/thin bs=2M count=25k
25600+0 records in
25600+0 records out
53687091200 bytes (54 GB) copied, 29.0871 s, 1.8 GB/s

The 1.8 GB/s throughput looks reasonable to me. However, after taking
a single snapshot of this thin device, the same command gives a much
lower throughput.
[root@localhost ~]# dd if=/dev/zero of=/dev/mapper/thin bs=2M count=25k
25600+0 records in
25600+0 records out
53687091200 bytes (54 GB) copied, 191.495 s, 280 MB/s

I am aware that writing to a snapshotted device triggers lots of
copy-on-write requests, so I was expecting a 50~60% performance loss
in this case. However, the test above shows a loss of about 85%. Am I
doing anything wrong, or is there anything I can tune to improve this?
If someone can point me in the right direction, I would be glad to
test or even modify the source code to solve this.

Thanks,
Dennis
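
As a quick check on the figures above, the loss can be recomputed from
the byte count and elapsed times that dd printed. This is only an
illustrative sketch; the function names are made up for this example
and are not part of any tool used in the test.

```python
# Recompute throughput and relative loss from the two dd runs quoted above.
BYTES_WRITTEN = 53687091200  # 25600 records of 2 MiB, as reported by dd


def throughput_mb_s(total_bytes: int, seconds: float) -> float:
    """Throughput in decimal megabytes per second (the unit dd prints)."""
    return total_bytes / seconds / 1e6


def loss_percent(before_mb_s: float, after_mb_s: float) -> float:
    """Relative throughput loss, in percent, going from before to after."""
    return (1.0 - after_mb_s / before_mb_s) * 100.0


before = throughput_mb_s(BYTES_WRITTEN, 29.0871)  # run before the snapshot
after = throughput_mb_s(BYTES_WRITTEN, 191.495)   # run after the snapshot

print(f"before: {before:.0f} MB/s, after: {after:.0f} MB/s, "
      f"loss: {loss_percent(before, after):.1f}%")
```

The result is consistent with the roughly 85% loss reported in the
message.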
* Re: Poor snapshot performance in linux-3.19
From: Zdenek Kabelac @ 2015-05-28 10:45 UTC
To: device-mapper development

Hi,

Using 0.5MB chunks and expecting fast snapshots is not going to work.
Have you measured the speed with smaller chunks, e.g. 64k/128k?

Zdenek
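
For reference, the thin-pool table takes its data block size in
512-byte sectors, so the 1024 in the original table corresponds to the
0.5MB chunks mentioned here. The sketch below shows the conversion for
the suggested sizes. The helper names are hypothetical; the device
paths, pool length, and feature argument are simply copied from the
original post.

```python
SECTOR_BYTES = 512  # device-mapper tables count in 512-byte sectors


def block_size_sectors(kib: int) -> int:
    """Convert a pool block size in KiB to 512-byte sectors."""
    return kib * 1024 // SECTOR_BYTES


def thin_pool_table(length, metadata_dev, data_dev, block_kib,
                    low_water_mark=0, features=("skip_block_zeroing",)):
    """Build a thin-pool table line in the same layout as the original post."""
    parts = [0, length, "thin-pool", metadata_dev, data_dev,
             block_size_sectors(block_kib), low_water_mark,
             len(features), *features]
    return " ".join(str(p) for p in parts)


for kib in (64, 128, 512):
    print(thin_pool_table(2478300160, "/dev/md0p1", "/dev/md0p2", kib))
```

With 512 the output reproduces the table from the first message; 64
and 128 give block sizes of 128 and 256 sectors respectively.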
* Re: Poor snapshot performance in linux-3.19
From: Dennis Yang @ 2015-05-29 3:03 UTC
To: device-mapper development

Hi,

I have run the same test on linux-3.19 with thin-pool block sizes of
64K and 128K, and got these results.

<<< 64k block size - before snapshot >>>
[root@localhost ~]# dd if=/dev/zero of=/dev/mapper/thin bs=2M count=25k
25600+0 records in
25600+0 records out
53687091200 bytes (54 GB) copied, 205.887 s, 261 MB/s

<<< 64k block size - after snapshot >>>
[root@localhost ~]# dd if=/dev/zero of=/dev/mapper/thin bs=2M count=25k
25600+0 records in
25600+0 records out
53687091200 bytes (54 GB) copied, 205.887 s, 261 MB/s

<<< 128k block size - before snapshot >>>
[root@localhost ~]# dd if=/dev/zero of=/dev/mapper/thin bs=2M count=25k
25600+0 records in
25600+0 records out
53687091200 bytes (54 GB) copied, 29.5981 s, 1.8 GB/s

<<< 128k block size - after snapshot >>>
[root@localhost ~]# dd if=/dev/zero of=/dev/mapper/thin bs=2M count=25k
25600+0 records in
25600+0 records out
53687091200 bytes (54 GB) copied, 197.798 s, 271 MB/s

I also ran a similar test with fio to observe the throughput over the
course of the snapshot test. The throughput first reaches 1.1GB/s and
then keeps bouncing between 700MB/s and 40MB/s.

The average queue size of the data device keeps bouncing between 700k
and 200k, while the average queue size of the RAID0 can reach 7
million. One odd thing is that even after the test is over, the
avgqu-sz of the RAID0 is still around 2 million with 100% utilization.

The server I tested is equipped with a Xeon E3-1246 with 4 physical
cores, so I do not think the CPU is the bottleneck for storage
throughput.

Any ideas?

Dennis
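
For readers unfamiliar with avgqu-sz: assuming the figures above came
from iostat, that column is derived from the kernel's per-device
counters in /sys/block/<dev>/stat, where field 10 is the time spent
doing I/O and field 11 is the weighted time in queue, both in
milliseconds. The sketch below shows the derivation from two samples.
The function names and the sample lines are invented for illustration
and are not measurements from this test.

```python
def parse_stat(line: str) -> dict:
    """Pick the queue-related counters out of a /sys/block/<dev>/stat line."""
    fields = [int(x) for x in line.split()]
    return {
        "in_flight": fields[8],          # field 9: requests currently in flight
        "io_ticks_ms": fields[9],        # field 10: time the device was busy
        "time_in_queue_ms": fields[10],  # field 11: weighted time in queue
    }


def queue_stats(first: str, second: str, interval_ms: float):
    """Return (average queue size, utilization %) between two samples."""
    a, b = parse_stat(first), parse_stat(second)
    avg_queue = (b["time_in_queue_ms"] - a["time_in_queue_ms"]) / interval_ms
    util = (b["io_ticks_ms"] - a["io_ticks_ms"]) / interval_ms * 100.0
    return avg_queue, util


# Two made-up samples taken 1000 ms apart.
sample_1 = "0 0 0 0 100 0 204800 500 0 400 500"
sample_2 = "0 0 0 0 300 0 614400 2500 4 1400 4500"
print(queue_stats(sample_1, sample_2, 1000.0))
```

Because the queue figure is a difference of weighted-time counters, a
request count that stays non-zero after the workload ends would keep
it growing; whether that is what happened here is not established in
this thread.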
* Re: Poor snapshot performance in linux-3.19
From: Joe Thornber @ 2015-05-29 15:29 UTC
To: device-mapper development
On Fri, May 29, 2015 at 11:03:28AM +0800, Dennis Yang wrote:
> The server I have tested is equipped with Xeon E3-1246 with 4 physical
> cores, so I think CPU might not be the bottleneck of the storage
> throughput.
> Any idea?
Is it any better with this patch?
https://github.com/jthornber/linux-2.6/commit/56db3d227ea7e11e42b2b67f8ddc9f6d40c3a3a4
* Re: Poor snapshot performance in linux-3.19
From: Dennis Yang @ 2015-06-01 10:39 UTC
To: device-mapper development
> On Fri, May 29, 2015 at 11:03:28AM +0800, Dennis Yang wrote:
>> The server I have tested is equipped with Xeon E3-1246 with 4 physical
>> cores, so I think CPU might not be the bottleneck of the storage
>> throughput.
>> Any idea?
>
> Is it any better with this patch?
>
> https://github.com/jthornber/linux-2.6/commit/56db3d227ea7e11e42b2b67f8ddc9f6d40c3a3a4
>

Hi,

I gave it a try this morning, but the performance looks the same (it
still drops from 1.8GB/s to 270MB/s after the snapshot). I also ran
some other experiments. If I make an EXT4 filesystem on top of a thin
device with a 64K pool block size, the file I/O throughput after the
snapshot is great (it drops only from 1.2GB/s to 1GB/s), because the
bio size is 64KB, which equals the pool block size. This makes me
wonder what throughput we should expect after a snapshot. Since each
write request turns from 1 write into (1 read + 1 write) + 1 write,
maybe the best performance we can get here is around 25~30% of the
throughput before the snapshot. Please correct me if I got anything
wrong.

Thanks,
Dennis
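
The counting argument above can be written down as a small model. This
is a deliberately simplified sketch, not a description of the dm-thin
code: it assumes reads and writes cost the same, ignores metadata
updates, assumes bios are aligned to pool blocks, and assumes the
workload eventually rewrites every shared block in full. The function
names are invented for this example.

```python
def device_bytes_per_block(bio_bytes: int, block_bytes: int) -> int:
    """Data-device traffic needed to rewrite one shared pool block."""
    if bio_bytes >= block_bytes and bio_bytes % block_bytes == 0:
        # The bio overwrites whole blocks: assume no old data is copied.
        return block_bytes
    # Otherwise: read the old block, write the copy, then write the new data.
    return 3 * block_bytes


def expected_fraction(bio_bytes: int, block_bytes: int) -> float:
    """Best-case throughput after a snapshot, relative to before it."""
    return block_bytes / device_bytes_per_block(bio_bytes, block_bytes)


KIB = 1024
print(expected_fraction(4 * KIB, 64 * KIB))   # bios smaller than the block
print(expected_fraction(64 * KIB, 64 * KIB))  # bios equal to the block

# Observed ratios reported in this thread, for comparison.
print(270 / 1800)  # dd on the raw thin device, after vs. before snapshot
print(1.0 / 1.2)   # EXT4 file I/O with 64K pool blocks
```

Under these assumptions the model gives one third for partial-block
bios and no loss for whole-block bios, so the dd result on the raw
device falls well below the model while the EXT4 result sits between
the two cases.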