* [linux-lvm] lvmcache in writeback mode
@ 2014-12-31 11:23 Pim van den Berg
  2015-01-05  9:35 ` Joe Thornber
  0 siblings, 1 reply; 3+ messages in thread
From: Pim van den Berg @ 2014-12-31 11:23 UTC (permalink / raw)
  To: linux-lvm

Hi,

A couple of days ago I switched from bcache to lvmcache, running a 
Linux 3.17.7 kernel. I was surprised by how easy lvmcache was to set 
up. :)

The system is a hypervisor and NFS server. The LV used by the 
NFS server is 1TB, with a 35GB SSD cache attached (1GB of metadata). One 
of the VMs runs a collectd (http://collectd.org/) server, which reads 
and writes a lot of RRD files via NFS on the LV that uses lvmcache.

My experience with bcache was that the RRD files were always in the SSD 
cache, because they were used so often, which was great! With bcache in 
writethrough mode the collectd VM had an average of 8-10% Wait-IO, 
because it had to wait until writes were written to the HDD. bcache in 
writeback mode resulted in ~1% Wait-IO on the VM. The writeback cache 
made writes very fast.

Now I have switched to lvmcache. This is the output of dmsetup status:
0 2147483648 cache 8 2246/262144 128 135644/573440 366608 166900 7866816 
295290 0 127321 0 1 writeback 2 migration_threshold 2048 mq 10 
random_threshold 4 sequential_threshold 512 discard_promote_adjustment 1 
read_promote_adjustment 0 write_promote_adjustment 0

As you can see I set read_promote_adjustment and 
write_promote_adjustment to 0.
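
For reference, here is a rough sketch of how a status line like the one
above can be split into named counters. It assumes the field order that
the kernel documents for the cache target in
Documentation/device-mapper/cache.txt, and the dictionary keys are just
labels of my own, not anything dmsetup itself prints:

# Sketch: turn one "dmsetup status" line for a dm-cache target into a
# dict of named counters (field order per the kernel's cache.txt docs).

def parse_cache_status(line):
    f = line.split()
    assert f[2] == 'cache', 'not a dm-cache target'
    meta_used, meta_total = f[4].split('/')
    cache_used, cache_total = f[6].split('/')
    return {
        'metadata_block_size': int(f[3]),   # in 512-byte sectors
        'metadata_used': int(meta_used),
        'metadata_total': int(meta_total),
        'cache_block_size': int(f[5]),      # in 512-byte sectors
        'cache_used': int(cache_used),
        'cache_total': int(cache_total),
        'read_hits': int(f[7]),
        'read_misses': int(f[8]),
        'write_hits': int(f[9]),
        'write_misses': int(f[10]),
        'demotions': int(f[11]),
        'promotions': int(f[12]),
        'dirty': int(f[13]),
    }

status = ('0 2147483648 cache 8 2246/262144 128 135644/573440 366608 '
          '166900 7866816 295290 0 127321 0 1 writeback 2 '
          'migration_threshold 2048 mq 10 random_threshold 4 '
          'sequential_threshold 512 discard_promote_adjustment 1 '
          'read_promote_adjustment 0 write_promote_adjustment 0')
print(parse_cache_status(status))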

I created a collectd plugin to monitor the lvmcache usage:
https://github.com/pommi/collectd-lvmcache

Here are the results of the past 2 hours:
http://pommi.nethuis.nl/wp-content/uploads/2014/12/lvmcache-usage.png
http://pommi.nethuis.nl/wp-content/uploads/2014/12/lvmcache-stats.png

The second link shows that there are many "Write hits". These map 
almost one-to-one onto this graph, which shows the eth0 network packets 
(NFS traffic) on the collectd VM:
http://pommi.nethuis.nl/wp-content/uploads/2014/12/lvmcache-vm-networkpackets.png

So I think the conclusion is that the lvmcache writeback cache is 
working quite well for the collectd RRDs.

But... when I look at the CPU usage of the VM, there is 8-10% Wait-IO 
(this also matches the two graphs mentioned above almost one-to-one):
http://pommi.nethuis.nl/wp-content/uploads/2014/12/lvmcache-vm-load.png

This is the same as having no SSD cache at all, or as bcache in 
writethrough mode. I was expecting ~1% Wait-IO.

How can this be explained?

From the stats it's clear that the pattern of "Network Packets", being 
NFS traffic, matches the lvmcache "Write hits" pattern. Does lvmcache in 
writeback mode still wait for its data to be written to the HDD? Does 
"Write hits" mean something different? Is "dmsetup status" giving me 
wrong information? Or do I still have to change some lvmcache settings 
to make this work as expected?

-- 
Regards,
Pim


* Re: [linux-lvm] lvmcache in writeback mode
  2014-12-31 11:23 [linux-lvm] lvmcache in writeback mode Pim van den Berg
@ 2015-01-05  9:35 ` Joe Thornber
  0 siblings, 0 replies; 3+ messages in thread
From: Joe Thornber @ 2015-01-05  9:35 UTC (permalink / raw)
  To: LVM general discussion and development

On Wed, Dec 31, 2014 at 12:23:00PM +0100, Pim van den Berg wrote:
> But... when I look at the CPU usage of the VM there is 8-10% Wait-IO
> (this also matches the 2 graphs mentioned above almost 1-on-1):
> http://pommi.nethuis.nl/wp-content/uploads/2014/12/lvmcache-vm-load.png
> 
> This is equal to having no SSD cache at all or bcache in
> writethrough mode. I was expecting ~1% Wait-IO.
> 
> How can this be explained?
> 
> From the stats its clear that the pattern of "Network Packets",
> being NFS traffic, matches the lvmcache "Write hits" pattern. Does
> lvmcache in writeback mode still wait for its data to be written to
> the HDD? Does "Write hits" mean something different? Is "dmsetup
> status" giving me wrong information? Or do I still have to set some
> lvmcache settings to make this work as expected?

I think your expectations of writeback mode are correct, but to spell
it out, here is some pseudocode.

In writeback mode:

   if block is on ssd
      write to ssd, complete bio once written
      increment write hit counter
   else
      write to origin and complete
      increment write miss counter

In writethrough mode:

   if block is on ssd
      write to ssd, then origin, complete
      increment write hit counter
   else
      write to origin and complete
      increment write miss counter


Some things that can slow down IOs:

- Changing a mapping due to the promotion or demotion of a block
  requires a metadata commit.  (Check that LVM2 has put the metadata on
  the ssd rather than the spindle.)

- REQ_DISCARD.  This is an expensive operation.  I advise people to
  periodically use fstrim rather than having the fs do it
  automatically when it deletes files.

- Background writeback IO could possibly be interfering with incoming
  writes.  E.g. if a dirty block is being written back when a write to
  that block comes in, then the write will be stalled.  Looking at the
  code I can see we're being very aggressive about writing everything
  back, irrespective of how recently the block was hit.  It would be
  trivial to change it to only write back after a number of policy
  'ticks'.  I'll do some experiments ...  (A rough way to watch the
  dirty count over time is sketched below.)
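
For example, something like this could sample the dirty-block count
from "dmsetup status" once a second, so it can be lined up against the
Wait-IO graph. The device name is only a placeholder, and the dirty
count is taken as the 14th whitespace-separated field, matching the
status line quoted above:

# Sketch: sample the dm-cache dirty-block count once a second.
# DEVICE is a placeholder for the cached LV's dm device name; the
# dirty count is field 14 of the "dmsetup status" output (needs root).
import subprocess
import time

DEVICE = 'vg0-lvol0'   # placeholder device name

while True:
    out = subprocess.check_output(['dmsetup', 'status', DEVICE]).decode()
    fields = out.split()
    print('%s dirty=%s' % (time.strftime('%H:%M:%S'), fields[13]))
    time.sleep(1)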

- Joe
