From mboxrd@z Thu Jan 1 00:00:00 1970
From: Josh Durgin
Subject: Re: poor read performance on rbd+LVM, LVM overload
Date: Sun, 20 Oct 2013 11:21:24 -0700
Message-ID: <52641F24.6000406@inktank.com>
References: <20131017151828.GB28859@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Errors-To: ceph-users-bounces-idqoXFIVOFJgJs9I8MT0rw@public.gmane.org
Sender: ceph-users-bounces-idqoXFIVOFJgJs9I8MT0rw@public.gmane.org
To: Ugis, Sage Weil
Cc: "ceph-devel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org", "ceph-users-Qp0mS5GaXlQ@public.gmane.org", Mike Snitzer, linux-lvm-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org
List-Id: ceph-devel.vger.kernel.org

On 10/20/2013 08:18 AM, Ugis wrote:
>>> output follows:
>>> #pvs -o pe_start /dev/rbd1p1
>>> 1st PE
>>> 4.00m
>>> # cat /sys/block/rbd1/queue/minimum_io_size
>>> 4194304
>>> # cat /sys/block/rbd1/queue/optimal_io_size
>>> 4194304
>>
>> Well, the parameters are being set at least. Mike, is it possible that
>> having minimum_io_size set to 4m is causing some read amplification
>> in LVM, translating a small read into a complete fetch of the PE (or
>> something along those lines)?
>>
>> Ugis, if your cluster is on the small side, it might be interesting to see
>> what requests the client is generating in the LVM and non-LVM case by
>> setting 'debug ms = 1' on the osds (e.g., ceph tell osd.* injectargs
>> '--debug-ms 1') and then looking at the osd_op messages that appear in
>> /var/log/ceph/ceph-osd*.log. It may be obvious that the IO pattern is
>> different.
>>
> Sage, here follows debug output. I am no pro in reading this, but
> seems read block size differ (or what is that number following ~ sign)?

Yes, that's the I/O length. LVM is sending requests for 4k at a time,
while plain kernel rbd is sending 128k.

> How to proceed with tuning read performance on LVM?
> Is there some change needed in code of ceph/LVM or does my config need
> to be tuned?
> If what is shown in logs means 4k read block in LVM case - then it
> seems I need to tell LVM (or xfs on top of LVM dictates read block
> size?) that io block should be rather 4m?

It's a client-side issue: the client is sending much smaller requests
than it needs to. Check the queue minimum and optimal I/O sizes for the
LVM device - it sounds like they might be getting set to 4k for some
reason.

Josh
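
The check suggested above can be sketched roughly as follows. This is a minimal illustration, not something from the thread: it assumes the logical volume shows up as a device-mapper node such as dm-0 under /sys/block (the name dm-0 is hypothetical, as are the function names), and it reads the same minimum_io_size and optimal_io_size files that were shown for rbd1.

```python
from pathlib import Path
import sys

# Queue limits discussed in the thread; sysfs reports both in bytes.
LIMIT_NAMES = ("minimum_io_size", "optimal_io_size")


def read_queue_limits(device, sysfs_root="/sys/block"):
    """Return {limit name: value in bytes, or None if it cannot be read}."""
    queue_dir = Path(sysfs_root) / device / "queue"
    limits = {}
    for name in LIMIT_NAMES:
        try:
            limits[name] = int((queue_dir / name).read_text().strip())
        except (OSError, ValueError):
            # Device missing, file unreadable, or content not an integer.
            limits[name] = None
    return limits


def mismatched_limits(lower_device, upper_device, sysfs_root="/sys/block"):
    """Return the limits that differ, as {name: (lower value, upper value)}."""
    lower = read_queue_limits(lower_device, sysfs_root)
    upper = read_queue_limits(upper_device, sysfs_root)
    return {
        name: (lower[name], upper[name])
        for name in LIMIT_NAMES
        if lower[name] != upper[name]
    }


if __name__ == "__main__":
    # Expects two device names, e.g. "rbd1" and the (assumed) "dm-0".
    if len(sys.argv) == 3:
        diffs = mismatched_limits(sys.argv[1], sys.argv[2])
        for name, (low, up) in diffs.items():
            print(f"{name}: {sys.argv[1]}={low} {sys.argv[2]}={up}")
        if not diffs:
            print("queue limits match")
```

Saved under a name of your choosing and run with the rbd device and the dm device as arguments (for example `rbd1 dm-0`), it prints any limit that differs between the two layers. The dm-N name backing a given logical volume can be looked up with `lsblk` or `ls -l /dev/mapper`.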