linux-lvm.redhat.com archive mirror
From: David McBride <dwm37@cam.ac.uk>
To: Sage Weil <sage@inktank.com>
Cc: ugis22@gmail.com,
	"ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>,
	"ceph-users@ceph.com" <ceph-users@ceph.com>,
	linux-lvm@redhat.com
Subject: Re: [linux-lvm] poor read performance on rbd+LVM, LVM overload
Date: Thu, 17 Oct 2013 10:06:55 +0100
Message-ID: <525FA8AF.1010408@cam.ac.uk>
In-Reply-To: <alpine.DEB.2.00.1310160914360.22271@cobra.newdream.net>

On 16/10/2013 17:16, Sage Weil wrote:

> I'm not sure what options LVM provides for aligning things to the
> underlying storage...

There is a generic kernel ABI for exposing performance properties of 
block devices to higher layers, so that they can automatically tune 
themselves according to those performance properties, and report their 
performance properties to users higher up the stack.

LVM can read this data from the underlying physical devices and
configure itself accordingly. It also reports the data to users of its
LVs, so that they can tune themselves in the same way.

(For example, mkfs.xfs uses libblkid to automatically select the optimal
stripe unit and stripe width for an LVM volume sitting on top of an MD
disk array.)
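
As a rough illustration of what that topology data looks like from
userspace, the sketch below reads the sysfs attributes that I believe the
kernel exports for each block device (logical_block_size,
physical_block_size, minimum_io_size and optimal_io_size under queue/,
plus alignment_offset). The function name, the default device "sda" and
the sysfs_root parameter are my own choices for the example, not anything
LVM or libblkid defines. Missing attributes are reported as None.

```python
from pathlib import Path

# Topology attributes relative to /sys/block/<device>/.
# Assumption: these are the standard sysfs names; check your kernel.
TOPOLOGY_FILES = {
    "logical_block_size": "queue/logical_block_size",
    "physical_block_size": "queue/physical_block_size",
    "minimum_io_size": "queue/minimum_io_size",
    "optimal_io_size": "queue/optimal_io_size",
    "alignment_offset": "alignment_offset",
}


def read_topology(device, sysfs_root="/sys/block"):
    """Return a dict of I/O topology values (bytes) for one block device.

    sysfs_root is a hypothetical parameter added so the function can be
    pointed at a test directory instead of the real sysfs.
    """
    base = Path(sysfs_root) / device
    topology = {}
    for name, relative_path in TOPOLOGY_FILES.items():
        try:
            topology[name] = int((base / relative_path).read_text().strip())
        except (OSError, ValueError):
            # Attribute missing or unreadable on this kernel/device.
            topology[name] = None
    return topology


if __name__ == "__main__":
    import sys

    device = sys.argv[1] if len(sys.argv) > 1 else "sda"
    for name, value in read_topology(device).items():
        print(f"{name}: {value}")
```

Running this against an RBD device and against an LV stacked on it
should show whether the RBD layer reports anything useful and whether
LVM passes it through.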

A good starting point appears to be:

 
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=c72758f33784e5e2a1a4bb9421ef3e6de8f9fcf3

If Ceph RBD block devices don't currently expose this information, that 
should be a relatively simple addition that will result in all higher 
layers, whether LVM or a native filesystem, automatically tuning 
themselves at creation-time for the RBD's performance characteristics.

(As an aside, it's possible that OSD journalling performance could also 
be improved by teaching it to heed this topology information.  I can 
imagine that when writing directly to block devices it may be possible 
to improve performance, such as when using LVM-on-an-SSD, or a DOS 
partition on a 4k-sector SATA disk.)

  ~ ~ ~

In the meantime, the documentation I found for LVM2 suggests that the
`pvcreate` command supports the "--dataalignment" and 
"--dataalignmentoffset" flags.

The former should be set to the RBD object size, e.g. 4MB by default.
In this case, you'll also need to set the latter to compensate for the
offset introduced by the GPT place-holder partition table at the start
of the device, so that LVM data extents begin on an object boundary.
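
To make the arithmetic concrete, here is a small sketch of how the two
values might be derived. It assumes that the PV data area starts at a
multiple of --dataalignment plus --dataalignmentoffset, measured from the
start of the PV, and that you know the byte offset at which the PV (for
instance a partition) begins on the RBD device. The helper names, the
device path /dev/rbd0p1 and the 1 MiB partition start are hypothetical;
please check pvcreate(8) and verify the result with
`pvs -o +pe_start` before relying on it.

```python
MIB = 1024 * 1024


def data_alignment_offset(pv_start_bytes, object_size_bytes=4 * MIB):
    """Offset that puts the PV data area on an object boundary.

    pv_start_bytes: where the PV begins on the underlying device.
    The data area begins at pv_start + k * object_size + offset, so the
    offset must cancel pv_start modulo the object size.
    """
    if object_size_bytes <= 0:
        raise ValueError("object size must be positive")
    if pv_start_bytes < 0:
        raise ValueError("PV start must not be negative")
    return (-pv_start_bytes) % object_size_bytes


def pvcreate_args(device, pv_start_bytes, object_size_bytes=4 * MIB):
    """Build an argument list for pvcreate (sizes expressed in KiB)."""
    offset = data_alignment_offset(pv_start_bytes, object_size_bytes)
    if object_size_bytes % 1024 or offset % 1024:
        raise ValueError("sizes must be whole KiB for this sketch")
    return [
        "pvcreate",
        "--dataalignment", f"{object_size_bytes // 1024}k",
        "--dataalignmentoffset", f"{offset // 1024}k",
        device,
    ]


if __name__ == "__main__":
    # Hypothetical example: partition starting 1 MiB into the RBD device.
    print(" ".join(pvcreate_args("/dev/rbd0p1", 1 * MIB)))
```

With a partition that starts 1 MiB into the device and 4 MiB objects,
the offset works out to 3 MiB, so the first extent lands 4 MiB from the
start of the RBD device.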

Cheers,
David
-- 
David McBride <dwm37@cam.ac.uk>
Unix Specialist, University Computing Service
