Date: Fri, 31 Dec 2010 04:41:55 +0100
From: Spelic
To: linux-lvm@redhat.com
Subject: Re: [linux-lvm] pvmove painfully slow on parity RAID

On 12/30/2010 08:12 PM, Stuart D. Gathman wrote:
> On Thu, 30 Dec 2010, Spelic wrote:
>
>> Also there is still the mystery of why the times appear *randomly* related
>> to the number of devices, chunk sizes, and stripe sizes! If the rmw cycle
>> were the culprit, how come I see:
>> raid5, 4 devices, 16384k chunk: 41sec (4.9MB/sec)
>> raid5, 6 devices, 4096k chunk: 2m18sec ?!?! (1.44MB/sec!?)
>> The first has the much larger stripe size of 49152K; the second has 20480K!
>>
> Ok, next theory. Pvmove works by allocating a mirror for each
> contiguous segment of the source LV, update metadata

Ok, never mind, I found the problem: LVM probably uses O_DIRECT, right?

Well, O_DIRECT is abysmally slow on MD parity raid (I just checked with dd on
the bare MD device), and I don't know why. It's not the rmw cycle, because it
is just as slow the second time I try, when nothing is read from disk any
more since all the reads are already in cache.

I understand this probably has to be fixed on the MD side (I will report the
problem to linux-raid, though I see it has already been discussed there
without much result).

However... is there any chance you might fix it on the LVM side too, by
changing LVM to use non-direct I/O so as to "support" MD?

On my raid5 array the difference between direct and non-direct I/O (dd with
bs=1M or smaller) is 2.1MB/sec versus 250MB/sec, and it would probably be
even larger on bigger arrays. On raid10, non-direct is also much faster at
small transfer sizes such as bs=4K (28MB/sec versus 160MB/sec), though not
at 1M; but LVM probably uses small transfer sizes, right?

Thank you
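
P.S. A sketch of the kind of dd comparison I mean (the device name /dev/md0
and the count are just placeholders, not the exact commands from my test;
note that writing to the bare MD device destroys whatever is on it):

  # direct I/O write to the bare MD array (O_DIRECT, bypasses the page cache)
  dd if=/dev/zero of=/dev/md0 bs=1M count=1000 oflag=direct

  # buffered (non-direct) write of the same amount, flushed at the end so the
  # timing is comparable
  dd if=/dev/zero of=/dev/md0 bs=1M count=1000 conv=fsync

  # repeat both with bs=4k to see the small-transfer-size effect mentioned
  # above for raid10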