* [linux-lvm] advice for curing terrible snapshot performance?
@ 2010-11-12 21:52 chris (fool) mccraw
2010-11-12 22:28 ` Joe Pruett
` (2 more replies)
0 siblings, 3 replies; 17+ messages in thread
From: chris (fool) mccraw @ 2010-11-12 21:52 UTC (permalink / raw)
To: linux-lvm
hi folks,
i'm new to linux lvm but a longtime user of lvm's on other commercial
unices. i love some of the features like r/w snapshots! and indeed
snapshots are the primary reason i'm even interested in using LVM on
linux.
however, snapshots really shoot my system in the foot. in my (12
processor, 12GB RAM, x86_64 centos 5.5) server, i have two pricey
areca hardware raid cards that give me ridiculous write performance:
i can sustain over 700MByte/sec write speeds (writing a file with dd
if=/dev/zero bs=1M, twice the size of the raid card's onboard memory
and timing the write + a sync afterwards) to the volumes on either
card (both backed by 7 or more fast disks). enabling a single
snapshot reduces that speed by a factor of 10! enabling more
snapshots isn't as drastic but still doesn't really scale well:
no snapshot = ~11sec (727MB/sec)
1 snapshot = ~102sec (78MB/sec)
2 snapshots = ~144sec (55MB/sec)
3 snapshots = ~313sec (25MB/sec)
4 snapshots = ~607sec (15MB/sec)
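for concreteness, a rough sketch of that timing method (mount point
and count are illustrative; the count is sized to about twice the
card's onboard cache):

    TESTFILE=/mnt/array1/ddtest
    time sh -c "dd if=/dev/zero of=$TESTFILE bs=1M count=8192 && sync"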
i have my snapshots set up on a separate array from the master
filesystem, on a separate raid card. i did not change the default
parameters for the setup (i.e. blocksize) because our typical workload
is reading and writing small (<64k) files. i can copy non-snapshotted
files from an LVM on array1 to array2 at a good clip, 307MByte/sec
(including sync). copies from the parent array to itself (no
snapshots enabled) go at about 220MByte/sec.
all of my measurements were repeated 4x and averaged--occasionally
there was one that was a good 30% faster than the other 3, but it was
always an outlier. typically all measurements for a given scenario
were within 10% of each other.
so i guess that a netapp replacement, set up with a few hourly &
daily, and even one weekly snapshot, isn't something people do with
stock linux LVM? or am i just doing it wrong?
in searching the archives i heard about zumastor. is that really
production-ready? the lack of new releases in the last 2 years and
its absence from the mainline kernel make me leery of it. i think we can
live with the factor-of-10 performance degradation on a daily
basis--we can turn off all the snapshots in case we really have to
hammer the server (which has a 4Gbit uplink to a render farm, so it is
possible for us to actually write over 70MByte/sec when things are
humming, via NFS), and in general it serves SMB at closer to 400Mbit
than 4000, so all the desktop users will not notice a difference.
it seems that others have seen these problems:
http://www.nikhef.nl/~dennisvd/lvmcrap.html as an example.
any thoughts?
thanks in advance for your input!
* Re: [linux-lvm] advice for curing terrible snapshot performance?
2010-11-12 21:52 [linux-lvm] advice for curing terrible snapshot performance? chris (fool) mccraw
@ 2010-11-12 22:28 ` Joe Pruett
2010-11-12 23:30 ` chris (fool) mccraw
2010-11-12 23:36 ` Joe Pruett
2010-11-15 14:35 ` Romeo Theriault
2010-11-15 20:37 ` Stephane Chazelas
2 siblings, 2 replies; 17+ messages in thread
From: Joe Pruett @ 2010-11-12 22:28 UTC (permalink / raw)
To: LVM general discussion and development
my understanding of how the lvm does snapshots is that a write causes a
lookup to see if that extent is dirty or clean. a dirty extent just
writes directly, a clean extent causes a copy from the main volume to
the snapshot volume and some amount of bookkeeping to track that the
block is now dirty and then the write completes to the main lv. so a
test where you are creating a bunch of updates would cause a write to
turn into a read and two writes, so i'd expect more like a 3x hit. and
i guess that the bookkeeping may have some sync calls in it, which could
cause some major stalls as caches flush.
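a rough way to watch that copy-out happen (volume group, lv, and
mount point names here are made up):

    lvcreate -s -L 10G -n snap0 /dev/vg0/data    # snapshot the origin lv
    dd if=/dev/zero of=/mnt/data/testfile bs=1M count=1024 conv=fsync
    lvs vg0    # Snap% on snap0 climbs as clean chunks are copied out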
have you tested the snapshots under normal load, and not test load? you
may be seeing the worst possible behavior (which is good to know), but
may not really occur under typical usage.
* Re: [linux-lvm] advice for curing terrible snapshot performance?
2010-11-12 22:28 ` Joe Pruett
@ 2010-11-12 23:30 ` chris (fool) mccraw
2010-11-12 23:36 ` Joe Pruett
1 sibling, 0 replies; 17+ messages in thread
From: chris (fool) mccraw @ 2010-11-12 23:30 UTC (permalink / raw)
To: LVM general discussion and development
On Fri, Nov 12, 2010 at 14:28, Joe Pruett <joey@q7.com> wrote:
> my understanding of how the lvm does snapshots is that a write causes a
> lookup to see if that extent is dirty or clean. a dirty extent just
> writes directly, a clean extent causes a copy from the main volume to
> the snapshot volume and some amount of bookkeeping to track that the
> block is now dirty and then the write completes to the main lv. so a
> test where you are creating a bunch of updates would cause a write to
> turn into a read and two writes, so i'd expect more like a 3x hit.
i would also expect that. it is reasonable, and the design seems such
that that's the best case. but that does not seem to be anyone's
experience that i could find. does anyone else out there have
performance numbers that don't suck? i stopped short of setting up a
ramdisk to hold the snapshots because that seems ridiculous and scales
terribly too...
i found one post to this list over a year ago about it, and Dennis'
web page (http://www.nikhef.nl/~dennisvd/lvmcrap.html), both firmly
corroborating my numbers.
> and
> i guess that the bookkeeping may have some sync calls in it, which could
> cause some major stalls as caches flush.
i want caches to flush; i want to know what the actual speed to write
to disk is.
> have you tested the snapshots under normal load, and not test load? you
> may be seeing the worst possible behavior (which is good to know), but
> may not really occur under typical usage.
sequential writes to a near-empty filesystem from a single non-network
source while the machine is otherwise quiet aren't exactly worst-case
situations, IMNSHO.
the fact that the machine became unavailable as a samba server (still
pingable and became functional again once i stopped writing) while i
was doing the tests with 4 snapshots (single writer thread) does not
lead me to want to put it into production with multiple writer threads
to see if that just happens to work for a while. i can vouch for the
fact that it doesn't work well when writing 0's locally--how does
throwing multiple network writers with samba and nfs into the picture
make things more transparent or even less load-inducing? couple that
with a backup reading from a snapshot at full throttle (a likely
scenario) and things aren't looking too rosy. at this point my
fanless 1ghz via home server with a single PATA disk does a better job
serving files than this $30,000 beast with 24x the ram, disks, and
ghz, when as few as 2 snapshots are enabled.
while the machine may do ok under real-world load, there is 0 chance
of me selling it as a solution to the team when my dinky test can
bring things to a halt, so we'll never know. (if i can find settings
that approximate the 3x penalty and scale linearly, this becomes far
more likely!) this group already uses rsnapshot and it works well (at
the cost of enough room to keep a complete backup online along with
deltas)--it certainly isn't as instant to create a snap, nor even
trying to be real-time updated, but it also doesn't bog the machine
down horribly when keeping, quite literally, 50 snapshots around
(possibly because rsnapshot runs single-threaded, leaving the other 11
procs to serve files, etc, whereas having X snapshots seems to lead to
X kcopyd's all using as much cpu as they can?). unless i'm doing
something terribly wrong, there is just no way that more than a
handful of snapshots will lead to reasonable performance--writing a
100MB file (not an unreasonable simulation of a real-world workload) with 10
snapshots takes almost a minute.
please tell me i'm doing something terribly wrong =) i want this to
work, but so far it doesn't seem like this technology is actually a
reasonable replacement for netapp style snapshots, at least not in
snapshot quantities >1?
* Re: [linux-lvm] advice for curing terrible snapshot performance?
2010-11-12 22:28 ` Joe Pruett
2010-11-12 23:30 ` chris (fool) mccraw
@ 2010-11-12 23:36 ` Joe Pruett
2010-11-13 0:17 ` chris (fool) mccraw
1 sibling, 1 reply; 17+ messages in thread
From: Joe Pruett @ 2010-11-12 23:36 UTC (permalink / raw)
To: LVM general discussion and development
i just did a bit of poking around and discovered that snapshots have
their own chunk size that is used for the copy on write magic. and it
defaults to 4k, and you can only increase that to 512k. a simple test
of creating a 1g file went from 240mbytes/sec to 4mbytes/sec with 4k
chunk, and 12mbytes/sec with 512k chunk. so i'm not sure what the
bottleneck is, but it surely is there.
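for reference, the chunk size is fixed per snapshot at creation time,
along these lines (names made up):

    # -c/--chunksize takes a power of 2 between 4k and 512k
    lvcreate -s -L 10G -c 512k -n snap0 /dev/vg0/data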
* Re: [linux-lvm] advice for curing terrible snapshot performance?
2010-11-12 23:36 ` Joe Pruett
@ 2010-11-13 0:17 ` chris (fool) mccraw
2010-11-13 0:58 ` Stuart D Gathman
2010-11-15 18:05 ` chris (fool) mccraw
0 siblings, 2 replies; 17+ messages in thread
From: chris (fool) mccraw @ 2010-11-13 0:17 UTC (permalink / raw)
To: LVM general discussion and development
On Fri, Nov 12, 2010 at 15:36, Joe Pruett <joey@q7.com> wrote:
> i just did a bit of poking around and discovered that snapshots have
> their own chunk size that is used for the copy on write magic.
indeed. i was hoping someone would advise me if there was a better
chunk size, which is why i said what i thought i was using (default =
64k) in my first post.
> and it
> defaults to 4k, and you can only increase that to 512k. a simple test
> of creating a 1g file went from 240mbytes/sec to 4mbytes/sec with 4k
> chunk, and 12mbytes/sec with 512k chunk. so i'm not sure what the
> bottleneck is, but it surely is there.
interestingly, the default snapshot chunk size on my system:
LVM version: 2.02.56(1)-RHEL5 (2010-04-22)
Library version: 1.02.39-RHEL5 (2010-04-22)
Driver version: 4.11.5
is 4k. a tutorial i was reading suggested it was 64k, and i didn't
double-check if that was true. i am going to have to wait until after
business hours to run more thorough tests, but i still see a slowdown
way over 10x even at 64k chunk size with a single snapshot. i'll try
it at all the different available chunk sizes and report back by
monday.
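the test loop will look roughly like this (volume names, sizes, and
mount point are illustrative):

    for cs in 16k 32k 64k 256k 512k; do
        lvcreate -s -L 20G -c $cs -n snaptest /dev/vg0/data
        time sh -c "dd if=/dev/zero of=/mnt/data/ddtest bs=1M count=8192 && sync"
        lvremove -f /dev/vg0/snaptest    # drop the snap before the next size
    done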
still curious about Zumastor--does anyone use this in production?
* Re: [linux-lvm] advice for curing terrible snapshot performance?
2010-11-13 0:17 ` chris (fool) mccraw
@ 2010-11-13 0:58 ` Stuart D Gathman
2010-11-15 17:52 ` chris (fool) mccraw
2010-11-15 18:05 ` chris (fool) mccraw
1 sibling, 1 reply; 17+ messages in thread
From: Stuart D Gathman @ 2010-11-13 0:58 UTC (permalink / raw)
To: linux-lvm
On 11/12/2010 07:17 PM, chris (fool) mccraw wrote:
> still curious about Zumastor--does anyone use this in production?
Zumastor would not speed up a single snapshot. It solves the problem of
supporting multiple (as in lots of) snapshots with the same performance
as a single snapshot (so that taking a snap every 5mins becomes
practical). BTW, rewriting that 1G file would be normal speed, since
the modified chunks have already been copied to the snapshot.
* Re: [linux-lvm] advice for curing terrible snapshot performance?
2010-11-13 0:58 ` Stuart D Gathman
@ 2010-11-15 17:52 ` chris (fool) mccraw
2010-11-15 18:04 ` Romeo Theriault
` (2 more replies)
0 siblings, 3 replies; 17+ messages in thread
From: chris (fool) mccraw @ 2010-11-15 17:52 UTC (permalink / raw)
To: LVM general discussion and development
On Fri, Nov 12, 2010 at 16:58, Stuart D Gathman <stuart@bmsi.com> wrote:
> On 11/12/2010 07:17 PM, chris (fool) mccraw wrote:
>> still curious about Zumastor--does anyone use this in production?
> Zumastor would not speed up a single snapshot. It solves the problem of
> supporting multiple (as in lots of) snapshots with the same performance
> as a single snapshot (so that taking a snap every 5mins becomes
> practical).
I realize this. like i said, we can probably eat the performance hit
for a single snapshot, but more than one is just not workable so far.
But I still haven't heard back from anyone who finds it
production-ready, so it is starting to feel entirely academic.
> BTW, rewriting that 1G file would be normal speed, since
> the modified chunks have already been copied to the snapshot.
i'd think that, and you'd think that, but it is not the case. most of
my tests were done by rewriting the file 4x, and while the snap %used
(monitored with the 'lvs' command) doesn't keep going up, performance
stays the same.
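the rewrite test, roughly (path, size, and the snap_percent field name
are assumptions; conv=notrunc rewrites in place instead of truncating
and reallocating):

    for i in 1 2 3 4; do
        dd if=/dev/zero of=/mnt/data/testfile bs=1M count=1024 conv=notrunc,fsync
        lvs --noheadings -o lv_name,snap_percent vg0    # watch snap %used
    done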
* Re: [linux-lvm] advice for curing terrible snapshot performance?
2010-11-15 17:52 ` chris (fool) mccraw
@ 2010-11-15 18:04 ` Romeo Theriault
2010-11-15 18:08 ` Joe Pruett
2010-11-15 23:51 ` Stuart D. Gathman
2 siblings, 0 replies; 17+ messages in thread
From: Romeo Theriault @ 2010-11-15 18:04 UTC (permalink / raw)
To: LVM general discussion and development
On Tue, Nov 16, 2010 at 02:52, chris (fool) mccraw <gently@gmail.com> wrote:
> On Fri, Nov 12, 2010 at 16:58, Stuart D Gathman <stuart@bmsi.com> wrote:
> > On 11/12/2010 07:17 PM, chris (fool) mccraw wrote:
> >> still curious about Zumastor--does anyone use this in production?
> > Zumastor would not speed up a single snapshot. It solves the problem of
> > supporting multiple (as in lots of) snapshots with the same performance
> > as a single snapshot (so that taking a snap every 5mins becomes
> > practical).
>
> I realize this. like i said, we can probably eat the performance hit
> for a single snapshot, but more than one is just not workable so far.
>
>
Found another interesting post about LVM perf with snapshots.
http://www.mysqlperformanceblog.com/2009/02/05/disaster-lvm-performance-in-snapshot-mode/
--
Romeo Theriault
* Re: [linux-lvm] advice for curing terrible snapshot performance?
2010-11-15 17:52 ` chris (fool) mccraw
2010-11-15 18:04 ` Romeo Theriault
@ 2010-11-15 18:08 ` Joe Pruett
2010-11-15 18:18 ` chris (fool) mccraw
2010-11-15 23:51 ` Stuart D. Gathman
2 siblings, 1 reply; 17+ messages in thread
From: Joe Pruett @ 2010-11-15 18:08 UTC (permalink / raw)
To: LVM general discussion and development
>> BTW, rewriting that 1G file would be normal speed, since
>> the modified chunks have already been copied to the snapshot.
> i'd think that, and you'd think that, but it is not the case. most of
> my tests were done by rewriting the file 4x, and while the snap %used
> (monitored with the 'lvs' command) doesn't keep going up, performance
> stays the same.
>
i did similar tests and i saw that even though i was rewriting the file,
it appeared that the ext3 layer was reallocating new blocks. the
snapshot usage would increase to show 3x the number of blocks i had
thought it should have touched, and then it seemed to stay stable. this
is one of the problems with a block-level (as opposed to file-level)
snapshot.
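one way to check that from userspace (path is made up; filefrag ships
with e2fsprogs):

    filefrag -v /mnt/data/testfile    # note the physical extent list
    dd if=/dev/zero of=/mnt/data/testfile bs=1M count=1024 conv=notrunc
    filefrag -v /mnt/data/testfile    # same extents => a true in-place rewrite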
* Re: [linux-lvm] advice for curing terrible snapshot performance?
2010-11-15 18:08 ` Joe Pruett
@ 2010-11-15 18:18 ` chris (fool) mccraw
0 siblings, 0 replies; 17+ messages in thread
From: chris (fool) mccraw @ 2010-11-15 18:18 UTC (permalink / raw)
To: LVM general discussion and development
On Mon, Nov 15, 2010 at 10:08, Joe Pruett <joey@q7.com> wrote:
>
>>> BTW, rewriting that 1G file would be normal speed, since
>>> the modified chunks have already been copied to the snapshot.
>> i'd think that, and you'd think that, but it is not the case. most of
>> my tests were done by rewriting the file 4x, and while the snap %used
>> (monitored with the 'lvs' command) doesn't keep going up, performance
>> stays the same.
>>
> i did similar tests and i saw that even though i was rewriting the file,
> it appeared that the ext3 layer was reallocating new blocks. the
> snapshot usage would increase to show 3x the number of blocks i had
> thought it should have touched, and then it seemed to stay stable. this
> is one of the problems with a block-level (as opposed to file-level)
> snapshot.
interestingly, xfs does not show the usage increase as sharply
(IIRC--i stopped checking that number after a few runs), but neither
is the performance better.
* Re: [linux-lvm] advice for curing terrible snapshot performance?
2010-11-15 17:52 ` chris (fool) mccraw
2010-11-15 18:04 ` Romeo Theriault
2010-11-15 18:08 ` Joe Pruett
@ 2010-11-15 23:51 ` Stuart D. Gathman
2010-11-16 0:09 ` chris (fool) mccraw
2 siblings, 1 reply; 17+ messages in thread
From: Stuart D. Gathman @ 2010-11-15 23:51 UTC (permalink / raw)
To: LVM general discussion and development
On Mon, 15 Nov 2010, chris (fool) mccraw wrote:
> > BTW, rewriting that 1G file would be normal speed, since
> > the modified chunks have already been copied to the snapshot.
>
> i'd think that, and you'd think that, but it is not the case. most of
> my tests were done by rewriting the file 4x, and while the snap %used
> (monitored with the 'lvs' command) doesn't keep going up, performance
> stays the same.
Are you writing to the snapshot or the origin? If writing to the
snapshot, and if your snap% is stable, then you are getting the additional seek
time to jump over to the COW for those sectors.
Once the COW has the copy of the original data for a chunk, then reads/writes
to that chunk on the origin should be identical to reads/writes without the
snapshot, except for some minor CPU overhead.
Another possibility is that while the snap% is not visibly increasing, you
are in fact updating new areas with each test.
--
Stuart D. Gathman <stuart@bmsi.com>
Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
"Confutatis maledictis, flammis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.
* Re: [linux-lvm] advice for curing terrible snapshot performance?
2010-11-15 23:51 ` Stuart D. Gathman
@ 2010-11-16 0:09 ` chris (fool) mccraw
0 siblings, 0 replies; 17+ messages in thread
From: chris (fool) mccraw @ 2010-11-16 0:09 UTC (permalink / raw)
To: LVM general discussion and development
On Mon, Nov 15, 2010 at 15:51, Stuart D. Gathman <stuart@bmsi.com> wrote:
> On Mon, 15 Nov 2010, chris (fool) mccraw wrote:
>
>> > BTW, rewriting that 1G file would be normal speed, since
>> > the modified chunks have already been copied to the snapshot.
>>
>> i'd think that, and you'd think that, but it is not the case. most of
>> my tests were done by rewriting the file 4x, and while the snap %used
>> (monitored with the 'lvs' command) doesn't keep going up, performance
>> stays the same.
>
> Are you writing to the snapshot or the origin? If writing to the
> snapshot, and if your snap% is stable, then you are getting the additional
> seek time to jump over to the COW for those sectors.
I've never written to the snapshot, or even mounted it except to prove
that i could...
> Once the COW has the copy of the original data for a chunk, then reads/writes
> to that chunk on the origin should be identical to reads/writes without the
> snapshot, except for some minor CPU overhead.
i don't doubt that, but i don't even want to write to a snapshot. i
want to write to the primary, and only occasionally even read from the
snapshot. it is the writes to the primary that i have been
documenting.
> Another possibility is that while the snap% is not visibly increasing, you
> are in fact updating new areas with each test.
and it's also possible i've been misremembering about the snap% not
growing. in fact in a quick test, it did grow with each rewrite.
perhaps i am remembering a test on an ext3 parent, whereas i am now
using xfs.
* Re: [linux-lvm] advice for curing terrible snapshot performance?
2010-11-13 0:17 ` chris (fool) mccraw
2010-11-13 0:58 ` Stuart D Gathman
@ 2010-11-15 18:05 ` chris (fool) mccraw
1 sibling, 0 replies; 17+ messages in thread
From: chris (fool) mccraw @ 2010-11-15 18:05 UTC (permalink / raw)
To: LVM general discussion and development
On Fri, Nov 12, 2010 at 16:17, chris (fool) mccraw <gently@gmail.com> wrote:
> interestingly, the default snapshot chunk size on my system:
>
> LVM version:     2.02.56(1)-RHEL5 (2010-04-22)
> Library version: 1.02.39-RHEL5 (2010-04-22)
> Driver version:  4.11.5
>
> is 4k. a tutorial i was reading suggested it was 64k, and i didn't
> double-check if that was true. i am going to have to wait until after
> business hours to run more thorough tests, but i still see a slowdown
> way over 10x even at 64k chunk size with a single snapshot. i'll try
> it at all the different available chunk sizes and report back by
> monday.
Well, I only made it through most chunk sizes. here were my results:
(previously obtained:
no snapshot = ~11sec (727MB/sec)
1 snapshot @4k chunks = ~102sec (78MB/sec)
)
newly obtained:
1 snapshot @16k chunks = ~639sec (12MB/sec)
1 snapshot @32k chunks = ~367sec (21MB/sec)
1 snapshot @64k chunks = ~252sec (31MB/sec)
1 snapshot @256k chunks = ~145sec (55MB/sec)
1 snapshot @512k chunks = ~100sec (80MB/sec)
wish i'd tried 8k, since that's the most interesting area of the
graph now that i lay the numbers out together. but no matter what
size i use, it's still nearly a factor of 10, and my numbers with the
default 4k chunks were almost as good as it gets.
* Re: [linux-lvm] advice for curing terrible snapshot performance?
2010-11-12 21:52 [linux-lvm] advice for curing terrible snapshot performance? chris (fool) mccraw
2010-11-12 22:28 ` Joe Pruett
@ 2010-11-15 14:35 ` Romeo Theriault
2010-11-15 17:46 ` chris (fool) mccraw
2010-11-15 20:37 ` Stephane Chazelas
2 siblings, 1 reply; 17+ messages in thread
From: Romeo Theriault @ 2010-11-15 14:35 UTC (permalink / raw)
To: LVM general discussion and development
On Sat, Nov 13, 2010 at 06:52, chris (fool) mccraw <gently@gmail.com> wrote:
> however, snapshots really shoot my system in the foot. in my (12
> processor, 12GB RAM, x86_64 centos 5.5) server, i have two pricey
> areca hardware raid cards that give me ridiculous write performance:
> i can sustain over 700MByte/sec write speeds (writing a file with dd
> if=/dev/zero bs=1M, twice the size of the raid card's onboard memory
> and timing the write + a sync afterwards) to the volumes on either
> card (both backed by 7 or more fast disks). enabling a single
> snapshot reduces that speed by a factor of 10! enabling more
> snapshots isn't as drastic but still doesn't really scale well:
>
Once you have snapshots and are doing new dd's, are you creating new
files or dd'ing over the old file?
> no snapshot = ~11sec (727MB/sec)
> 1 snapshot = ~102sec (78MB/sec)
> 2 snapshots = ~144sec (55MB/sec)
> 3 snapshots = ~313sec (25MB/sec)
> 4 snapshots = ~607sec (15MB/sec)
>
I would assume a "yes", but does your write performance return to normal
after you remove all the snapshots?
Romeo
* Re: [linux-lvm] advice for curing terrible snapshot performance?
2010-11-15 14:35 ` Romeo Theriault
@ 2010-11-15 17:46 ` chris (fool) mccraw
0 siblings, 0 replies; 17+ messages in thread
From: chris (fool) mccraw @ 2010-11-15 17:46 UTC (permalink / raw)
To: LVM general discussion and development
On Mon, Nov 15, 2010 at 06:35, Romeo Theriault
<romeo.theriault@maine.edu> wrote:
>
>
> On Sat, Nov 13, 2010 at 06:52, chris (fool) mccraw <gently@gmail.com> wrote:
>>
>> however, snapshots really shoot my system in the foot. in my (12
>> processor, 12GB RAM, x86_64 centos 5.5) server, i have two pricey
>> areca hardware raid cards that give me ridiculous write performance:
>> i can sustain over 700MByte/sec write speeds (writing a file with dd
>> if=/dev/zero bs=1M, twice the size of the raid card's onboard memory
>> and timing the write + a sync afterwards) to the volumes on either
>> card (both backed by 7 or more fast disks). enabling a single
>> snapshot reduces that speed by a factor of 10! enabling more
>> snapshots isn't as drastic but still doesn't really scale well:
>
>
> Once you have snapshots and are doing new dd's are you creating new files or
> dd'ing over the old file?
tried both ways with no obvious performance difference (like i said,
sometimes things go up to 30% faster or slower for 1 run, but that's
happened doing it both ways).
>> no snapshot  = ~11sec (727MB/sec)
>> 1 snapshot   = ~102sec (78MB/sec)
>> 2 snapshots  = ~144sec (55MB/sec)
>> 3 snapshots  = ~313sec (25MB/sec)
>> 4 snapshots  = ~607sec (15MB/sec)
>
> I would assume a "yes", but does your write performance return to normal
> after you remove all the snapshots?
absolutely, it does.
* Re: [linux-lvm] advice for curing terrible snapshot performance?
2010-11-12 21:52 [linux-lvm] advice for curing terrible snapshot performance? chris (fool) mccraw
2010-11-12 22:28 ` Joe Pruett
2010-11-15 14:35 ` Romeo Theriault
@ 2010-11-15 20:37 ` Stephane Chazelas
2010-11-15 22:57 ` Stuart D. Gathman
2 siblings, 1 reply; 17+ messages in thread
From: Stephane Chazelas @ 2010-11-15 20:37 UTC (permalink / raw)
To: LVM general discussion and development
2010-11-12 13:52:25 -0800, chris (fool) mccraw:
[...]
> no snapshot = ~11sec (727MB/sec)
> 1 snapshot = ~102sec (78MB/sec)
> 2 snapshots = ~144sec (55MB/sec)
> 3 snapshots = ~313sec (25MB/sec)
> 4 snapshots = ~607sec (15MB/sec)
[...]
Yes, once a volume has 2 or more snapshots, LVM snapshots become
unusable. Other issues you may hit with LVM snapshots, depending on
the version: some grub versions failing to see LVM volumes once
they have a snapshot, and activating a volume group taking several
minutes when its snapshot volumes are fairly full.
ddsnap (zumastor) worked great but is no longer maintained.
So, you have:
- LVM snapshots: poor performance or very few snapshots
- ddsnap: got to stick with an old kernel
- btrfs, not really production ready yet. Probably your best bet
though.
- fuse solutions, with all the fuse related issues.
Quite frustrating as the technology is here and has been around
for years.
Sorry for not bringing any good news. I'm all ears for good
alternative solutions though.
Another solution I had looked at at some point:
- qemu-nbd with qcow2: performance not there and a few issues as
well.
--
Stephane
* Re: [linux-lvm] advice for curing terrible snapshot performance?
2010-11-15 20:37 ` Stephane Chazelas
@ 2010-11-15 22:57 ` Stuart D. Gathman
0 siblings, 0 replies; 17+ messages in thread
From: Stuart D. Gathman @ 2010-11-15 22:57 UTC (permalink / raw)
To: LVM general discussion and development
On Mon, 15 Nov 2010, Stephane Chazelas wrote:
> ddsnap (zumastor) worked great but is no longer maintained.
>
> - LVM snapshots: poor performance or very few snapshots
> - ddsnap: got to stick with an old kernel
zumastor doesn't depend on kernel versions - it uses device-mapper just
like regular snapshots. (Or does it have a special device-mapper
kernel module?)
> - btrfs, not really production ready yet. Probably your best bet
> though.
This is good too, but at a different level. We still need LV snapshots.
> - fuse solutions, with all the fuse related issues.
--
Stuart D. Gathman <stuart@bmsi.com>
Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
"Confutatis maledictis, flammis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.