From: Mike Snitzer <snitzer@redhat.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: elder@inktank.com, Sage Weil <sage@inktank.com>,
Ugis <ugis22@gmail.com>,
linux-lvm@redhat.com,
"ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>,
"ceph-users@ceph.com" <ceph-users@ceph.com>
Subject: Re: [linux-lvm] poor read performance on rbd+LVM, LVM overload
Date: Mon, 21 Oct 2013 11:06:05 -0400
Message-ID: <20131021150605.GB28099@redhat.com>
In-Reply-To: <20131021150129.GA28099@redhat.com>
On Mon, Oct 21 2013 at 11:01am -0400,
Mike Snitzer <snitzer@redhat.com> wrote:
> On Mon, Oct 21 2013 at 10:11am -0400,
> Christoph Hellwig <hch@infradead.org> wrote:
>
> > On Sun, Oct 20, 2013 at 08:58:58PM -0700, Sage Weil wrote:
> > > It looks like without LVM we're getting 128KB requests (which IIRC is
> > > typical), but with LVM it's only 4KB. Unfortunately my memory is a bit
> > > fuzzy here, but I seem to recall a property on the request_queue or device
> > > that affected this. RBD is currently doing
> >
> > Unfortunately most device mapper modules still split all I/O into 4k
> > chunks before handling them. They rely on the elevator to merge them
> > back together down the line, which isn't overly efficient but should at
> > least provide larger segments for the common cases.
>
> It isn't DM that splits the IO into 4K chunks; it is the VM subsystem,
> no?  Unless care is taken to assemble larger bios (higher up the IO
> stack, e.g. in XFS), all buffered IO will come to bio-based DM targets
> in $PAGE_SIZE granularity.
>
> I would expect direct IO to before better here because it will make use
> of bio_add_page to build up larger IOs.
s/before/perform/ ;)
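
To make the bio_add_page() point concrete, here is a minimal sketch
(illustrative only, with a made-up helper name; not actual fs/direct-io.c
code) of how a submitter can build one large bio page by page until the
queue limits refuse further pages:

    #include <linux/bio.h>
    #include <linux/blkdev.h>

    /* Illustrative helper: pack as many pages as the queue limits
     * allow into a single bio before submitting it. */
    static struct bio *build_large_bio(struct block_device *bdev,
                                       sector_t sector,
                                       struct page **pages,
                                       unsigned int nr_pages)
    {
            struct bio *bio = bio_alloc(GFP_KERNEL, nr_pages);
            unsigned int i;

            bio->bi_bdev = bdev;
            bio->bi_sector = sector;  /* bi_iter.bi_sector on newer kernels */

            for (i = 0; i < nr_pages; i++) {
                    /* bio_add_page() returns the number of bytes it
                     * actually added; it returns 0 once the queue's
                     * limits (max sectors/segments) would be exceeded. */
                    if (bio_add_page(bio, pages[i], PAGE_SIZE, 0) < PAGE_SIZE)
                            break;  /* submit this bio and start another */
            }
            return bio;
    }

Buffered writeback, by contrast, typically hands bio-based DM one page
per bio, which is the 4K granularity described above.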
> Taking a step back, the rbd driver is exposing both the minimum_io_size
> and optimal_io_size as 4M. This symmetry will cause XFS to _not_ detect
> the exposed limits as striping. Therefore, AFAIK, XFS won't take steps
> to respect the limits when it assembles its bios (via bio_add_page).
>
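(For reference, these hints are set by the driver on its request_queue;
what rbd effectively does today amounts to the following sketch, with the
values hard-coded here purely for illustration:

    blk_queue_io_min(q, 4 * 1024 * 1024);   /* minimum_io_size = 4M */
    blk_queue_io_opt(q, 4 * 1024 * 1024);   /* optimal_io_size = 4M */

mkfs.xfs picks these up from sysfs, and with the two equal there is no
chunk/stripe relationship for it to turn into sunit/swidth.)
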
> Sage, any reason why you don't use traditional raid geometry based IO
> limits?  E.g.:
>
> minimum_io_size = raid chunk size
> optimal_io_size = raid chunk size * N stripes (aka full stripe)
>
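
With made-up numbers, treating rbd's 4M object size as the chunk and a
hypothetical stripe of 16 objects, that suggestion would look like:

    blk_queue_io_min(q, 4 << 20);           /* minimum_io_size = 4M chunk   */
    blk_queue_io_opt(q, 16 * (4 << 20));    /* optimal_io_size = 64M stripe */

With optimal_io_size a larger multiple of minimum_io_size, XFS can infer
a striped geometry the same way it does for MD/LVM raid devices.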